clarkgrubb / data-tools
File format conversion tools
☆291 Updated 4 years ago
Alternatives and similar repositories for data-tools:
Users who are interested in data-tools are comparing it to the libraries listed below.
- Randomly sample lines from a csv, tsv, or other line-based data file ☆125 Updated 10 years ago
- Automatically exported from code.google.com/p/crush-tools ☆150 Updated 9 years ago
- A Python library for creating fast, repeatable and self-documenting data analysis pipelines. ☆238 Updated 2 months ago
- Convert an XML input to a JSON output, using xml-mapping ☆162 Updated 8 years ago
- The (large) data files needed for the Data Science Toolkit project ☆232 Updated 11 years ago
- Qualitative visualization of the data types of CSV files ☆257 Updated 10 years ago
- Analyzes a CSV file and generates database table schema, all within the browser ☆315 Updated 9 years ago
- Command-line tool for manipulating CSV data ☆75 Updated 7 years ago
- Schemas to convert common fixed-width file formats into CSV using in2csv. ☆124 Updated 3 years ago
- Convert text from a file or from stdin into SQL table and query it instantly. Uses sqlite as backend. The idea is to make SQL into a tool… ☆287 Updated 5 years ago
- Like awk, but with SQL and table joins ☆313 Updated 5 months ago
- A proofreader for your data ☆693 Updated 2 years ago
- Extract tables from PDF files ☆356 Updated 8 years ago
- Code + Jupyter notebook for analyzing and visualizing Reddit Data quickly and easily ☆113 Updated 9 years ago
- Portland Python Meetup March 2015 ☆40 Updated 10 years ago
- A (comprehensive) collection of open source tools used by the data community. ☆51 Updated 9 years ago
- My IPython startup files. ☆109 Updated 10 years ago
- Analyze the structure and dynamics of an open source project's developer community, using graph algorithms, etc. ☆58 Updated 4 years ago
- A desktop CSV editor for data publishers ☆282 Updated last year
- A polite, minimal interface for sending python objects to and from Amazon S3. ☆57 Updated 9 years ago
- Sample repo for luigi tasks & config ☆36 Updated 8 years ago
- SQLCell is a magic function for the Jupyter Notebook that executes raw, parallel, parameterized SQL queries with the ability to accept Py… ☆151 Updated 2 years ago
- Docker images for data science from Wise.io ☆50 Updated 9 years ago
- Material for some talks I have given ☆62 Updated 7 months ago
- An external PySpark module that works like R's read.csv or pandas' read_csv, with automatic type inference and null value handling. Parse… ☆90 Updated 9 years ago
- Utils around luigi. ☆66 Updated 4 years ago
- Transform nested JSON data into tabular data in the shell. ☆288 Updated 7 years ago
- Enables common unix utilities like cut, awk, wc, head to work correctly with csv data containing delimiters and newlines ☆446 Updated last year
- GNU-alike tools for parsing RFC 4180 CSVs at high speed. ☆102 Updated last year
- Run IPython, Pattern, NLTK, Pandas, NumPy, SciPy, Numba, Biopython inside Docker ☆47 Updated 10 years ago