clarkgrubb / data-tools
File format conversion tools
☆291, updated 2 months ago
Alternatives and similar repositories for data-tools
Users who are interested in data-tools are comparing it to the libraries listed below.
- Convert an XML input to a JSON output, using xml-mapping ☆162, updated 9 years ago
- Schemas to convert common fixed-width file formats into CSV using in2csv. ☆125, updated 4 years ago
- A Python library for creating fast, repeatable and self-documenting data analysis pipelines. ☆242, updated last week
- Randomly sample lines from a csv, tsv, or other line-based data file ☆125, updated 10 years ago
- Code + Jupyter notebook for analyzing and visualizing Reddit Data quickly and easily ☆112, updated 10 years ago
- Python library and command line tool for converting data from one format to another ☆99, updated 5 years ago
- Tools for parsing messy tabular data. This is now superseded by https://github.com/frictionlessdata/tabulator-py ☆390, updated 2 years ago
- Extract tables from PDF files ☆358, updated 9 years ago
- Tool for visual exploration of complex data. ☆192, updated 7 years ago
- A (comprehensive) collection of open source tools used by the data community. ☆52, updated 9 years ago
- Qualitative visualization of the data types of CSV files ☆258, updated 11 years ago
- Enables common Unix utilities like cut, awk, wc, head to work correctly with CSV data containing delimiters and newlines ☆450, updated 2 years ago
- Generate a diff between two tabular datasets expressed in CSV files. ☆131, updated 4 years ago
- A framework (command line tool + libraries) for creating flexible compute pipelines ☆56, updated 4 years ago
- Utils around luigi. ☆66, updated 2 months ago
- Command-line tool for manipulating CSV data ☆74, updated 7 years ago
- Material for some talks I have given ☆62, updated last year
- Sample repo for luigi tasks & config ☆36, updated 9 years ago
- JSON -> Relational DB Column Types ☆63, updated 2 years ago
- A complete environment for busy polyglot data scientists ☆472, updated 4 years ago
- A Python data analysis library that is optimized for humans instead of machines. ☆1,194, updated this week
- ☆84, updated 7 years ago
- A wrapper around gitpython to produce pandas dataframes for analysis ☆191, updated 3 months ago
- Tools for text tokenization and encoding ☆84, updated 4 years ago
- Docker images for data science from Wise.io ☆51, updated 9 years ago
- Tribe extracts a network from an email mbox and writes it to a graphml file for visualization and analysis. ☆79, updated 2 years ago
- Multidimensional data explorer and visualization tool. ☆56, updated 8 years ago
- workflow support for reproducible deduplication and merging ☆16, updated 2 years ago
- Data Pipes for CSV ☆116, updated 2 years ago
- enable rapid iteration and development of complex data pipelines ☆29, updated 7 months ago