Jin Park

Obtaining, Scrubbing, and Exploring Data at the Command Line

Yesterday I went to a meetup called Obtaining, Scrubbing, and Exploring Data at the Command Line.

It was a mix of old and new: the standard Unix command line tools (cat, awk, grep, sed, less, head, tail, etc.) alongside some newer ones (csvkit, a bunch of small command line tools for CSV; jq, a tool for handling JSON on the command line; xml2json; json2csv; and others).
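To give a feel for how the standard tools chain together, here is a minimal sketch of my own, not something taken from the talk. The sample file, its columns, and the variable names are all made up for illustration, and it only uses the classic tools (head, tail, grep, awk, cut, sort, uniq).

```shell
# Hypothetical sample data; the columns and values are invented for this sketch.
data=$(mktemp)
cat > "$data" <<'EOF'
name,city,amount
alice,Seoul,10
bob,Busan,5
carol,Seoul,7
EOF

# Peek at the header row to see which columns exist.
head -n 1 "$data"

# Skip the header, keep rows whose city is Seoul, and sum the third column.
seoul_total=$(tail -n +2 "$data" | grep ',Seoul,' | awk -F, '{ sum += $3 } END { print sum }')
echo "$seoul_total"

# Count rows per city, most frequent city first.
city_counts=$(tail -n +2 "$data" | cut -d, -f2 | sort | uniq -c | sort -rn)
echo "$city_counts"
```

Splitting on commas like this breaks as soon as a field contains a quoted comma, which is presumably where a CSV-aware tool such as csvkit earns its place.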

The speaker also got into a bit of his own tools, from something big, like being able to call R from the command line, to small bash functions that use http://explainshell.com to print out what a shell command does.
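I don't have the speaker's actual functions, but a small helper in that spirit might look something like the sketch below. The function name explain_url is hypothetical, and the /explain?cmd= query format is my assumption about how the site's URLs are built. It only constructs the URL and does not fetch or parse the page.

```shell
# Hypothetical helper, not the speaker's function.
# Assumption: explainshell.com accepts the command in a "cmd" query parameter.
explain_url() {
  # Join all arguments with spaces, then turn the spaces into '+'.
  # Only spaces are encoded; other special characters pass through unchanged.
  printf 'http://explainshell.com/explain?cmd=%s\n' "$(printf '%s' "$*" | tr ' ' '+')"
}

# Print the URL for a sample command.
explain_url tar xzvf archive.tar.gz
```

From there the URL could be opened in a browser or handed to a tool like curl, which is where a real version would need proper URL encoding.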

He talked about creating “data science toolkits” and gave some examples of packaging and creating environments that can be easily moved around. Here is his example.

Here are the slides from the talk.

data science command line meetup
