pushshift / Parallel-NDJSON-Reader
Parallel NDJSON Reader for Python
☆17Updated 6 years ago
Alternatives and similar repositories for Parallel-NDJSON-Reader
Users who are interested in Parallel-NDJSON-Reader are comparing it to the libraries listed below.
- Text Thresher crowdsourced text annotator☆17Updated 8 years ago
- ☆76Updated this week
- Tokenizer for Twitter and Reddit data☆45Updated 6 years ago
- The core of sunlightlabs' Data Commons project. Includes the Transparency Data site and the APIs that power TransparencyData.com and Infl…☆38Updated 9 years ago
- A library that enables you to easily parse and transform ORCID metadata between XML, JSON and Java objects☆20Updated 4 years ago
- A simple command line interface to the datamade/dedupe library.☆43Updated 3 years ago
- Daily refreshed data on representation certification and unfair labor cases from nlrb.gov☆21Updated 2 months ago
- Performs unique entity estimation corresponding to Chen, Shrivastava, Steorts (2018).☆15Updated 6 years ago
- The documentation and scripts for the Local News Dataset☆25Updated 3 years ago
- Ensemble topic modelling with pLSA☆114Updated 4 years ago
- An implementation of latent Dirichlet allocation in JavaScript☆185Updated 3 years ago
- ☆23Updated last year
- A multi-modal Twitter dataset with 7.6M tweets and 25.6M retweets related to voter fraud claims.☆53Updated 4 years ago
- Utilities for retrieving whitehouse.gov transcripts and matching news quotes to them☆16Updated 11 years ago
- A browser user interface for manual labeling of record pairs.☆48Updated 2 years ago
- Python wrapper for a C++ Double Metaphone☆15Updated 3 weeks ago
- Classify names by gender, U.S. ethnicity, or leaf nationality☆19Updated 7 years ago
- Fast, flexible name matching for large datasets☆71Updated 5 months ago
- Command line tool for manipulating and analyzing text☆29Updated 3 years ago
- Read compressed NDJSON .zst files easily☆35Updated 3 years ago
- Collecting thoughts about data versioning☆108Updated 6 years ago
- Investigative tool for extracting relevant areas from many documents☆14Updated 10 years ago
- The RICardo dataset compiles trade statistics sources of international trade bilateral flows of the 19th century.☆19Updated last month
- Notebooks configured to be run with Binder, usually found on my blog.☆42Updated 2 years ago
- Extract place names from a URL or text, and add context to those names -- for example distinguishing between a country, region or city.☆62Updated 9 years ago
- A lightweight end-to-end NLP and visualization platform to make WordStream.☆44Updated 2 years ago
- Interactive Network Graph Visualization for NDTV-generated graphs using D3 animation☆18Updated 10 years ago
- Tools to download and process name data from various sources.☆91Updated 12 years ago
- pysmap is a high-level interface for working with Twitter data.☆21Updated 5 years ago
- Data and scripts relating to the publishing of the House expenditure reports, and hopefully the Senate's in future.☆24Updated 5 years ago