petewarden / dstkdata
The (large) data files needed for the Data Science Toolkit project
☆233Updated 12 years ago
Alternatives and similar repositories for dstkdata
Users who are interested in dstkdata are comparing it to the repositories listed below
- Automatically exported from code.google.com/p/crush-tools☆150Updated 9 years ago
- Like awk, but with SQL and table joins☆315Updated last year
- Num: number utilities for mathematics☆134Updated 2 years ago
- commandline tools for slicing and dicing JSON records.☆305Updated 5 years ago
- File format conversion tools☆292Updated 5 months ago
- Enables common Unix utilities like cut, awk, wc, head to work correctly with CSV data containing delimiters and newlines☆452Updated 2 years ago
- Elastic tabstops for Rust.☆271Updated 4 months ago
- Transform nested JSON data into tabular data in the shell.☆292Updated 7 years ago
- Convert text from a file or from stdin into SQL table and query it instantly. Uses sqlite as backend. The idea is to make SQL into a tool…☆285Updated 5 years ago
- Remove bad records from a CSV file and normalize☆57Updated 3 years ago
- Quick and dirty statistics tool for the UNIX pipeline☆61Updated 9 years ago
- The tool I used to write my book, Effective Python.☆85Updated 8 years ago
- A system to programmatically run data pipelines☆226Updated 2 months ago
- VisiData interface for databases☆68Updated 2 years ago
- Dataframe structure and operations in Rust☆145Updated 7 years ago
- GNU-alike tools for parsing RFC 4180 CSVs at high speed.☆108Updated 5 months ago
- Query your CSV files with SQL☆215Updated 2 months ago
- An efficient way to filter duplicate lines from input, à la uniq.☆222Updated last month
- Say "ni" to data of any size☆86Updated 2 months ago
- Convert a CSV to a parquet file.☆64Updated 3 years ago
- ☆117Updated 2 years ago
- Convert CSV files to Apache Parquet.☆79Updated 3 years ago
- Simple Python scripts to download all Hacker News submissions and comments and store them in a PostgreSQL database.☆127Updated 8 years ago
- Search lots of data sets for spurious correlations☆64Updated 3 years ago
- A Rust DataFrame implementation, built on Apache Arrow☆280Updated 5 years ago
- A utility for sorting really big files. http://kmkeen.com/gz-sort/☆94Updated 7 years ago
- A Python data analysis library that is optimized for humans instead of machines.☆1,194Updated 3 weeks ago
- Data workflow tool, like a "Make for data"☆1,483Updated 3 years ago
- Distill a JSON document into a collection of paths both for 'jq' and 'xpath'☆107Updated 4 years ago
- Ben Franklin-esque Schedule in LaTeX☆16Updated 9 years ago