clarkgrubb / data-tools
File format conversion tools
☆289Updated 3 years ago
Related projects:
- A Python library for creating fast, repeatable and self-documenting data analysis pipelines.☆236Updated 6 months ago
- Automatically exported from code.google.com/p/crush-tools☆150Updated 8 years ago
- Randomly sample lines from a csv, tsv, or other line-based data file☆122Updated 9 years ago
- Schemas to convert common fixed-width file formats into CSV using in2csv.☆123Updated 3 years ago
- Convert an XML input to a JSON output, using xml-mapping☆161Updated 8 years ago
- Qualitative visualization of the data types of CSV files☆254Updated 10 years ago
- Code + Jupyter notebook for analyzing and visualizing Reddit Data quickly and easily☆112Updated 8 years ago
- Command-line tools for slicing and dicing JSON records.☆300Updated 4 years ago
- Like awk but with SQL and table joins☆310Updated 4 months ago
- Command-line tool for manipulating CSV data☆75Updated 6 years ago
- Tools for parsing messy tabular data. This is now superseded by https://github.com/frictionlessdata/tabulator-py☆389Updated last year
- Extract tables from PDF files☆354Updated 8 years ago
- Data workflow tool, like a "Make for data"☆1,481Updated 2 years ago
- Analyzes a CSV file and generates database table schema, all within the browser☆317Updated 8 years ago
- Convert text from a file or from stdin into SQL table and query it instantly. Uses sqlite as backend. The idea is to make SQL into a tool…☆284Updated 4 years ago
- Tools for generating CSV and other flat versions of the structured data☆103Updated this week
- A Python data analysis library that is optimized for humans instead of machines.☆1,170Updated last month
- Utils around luigi.☆65Updated 3 years ago
- A framework (command-line tool + libraries) for creating flexible compute pipelines☆55Updated 3 years ago
- The (large) data files needed for the Data Science Toolkit project☆224Updated 11 years ago
- ☆85Updated 6 years ago
- A Topic Modeling toolbox☆93Updated 8 years ago
- Generate CSV files with fake data from the command line☆66Updated 5 years ago
- A converter that generates a bash one-liner from an SQL Select query (no DB necessary)☆287Updated 8 years ago
- SQLCell is a magic function for the Jupyter Notebook that executes raw, parallel, parameterized SQL queries with the ability to accept Py…☆150Updated 2 years ago
- Introduction to Statistics☆231Updated 8 years ago
- Refinery - A locally deployable open-source web platform for analysis of large document collections☆102Updated 8 years ago
- Enables common Unix utilities like cut, awk, wc, head to work correctly with CSV data containing delimiters and newlines☆445Updated last year
- Sample repo for luigi tasks & config☆36Updated 8 years ago
- Transform nested JSON data into tabular data in the shell.☆280Updated 6 years ago