spencermountain / dumpster-dive
roll a wikipedia dump into mongo
☆241 Updated 6 months ago
Alternatives and similar repositories for dumpster-dive:
Users who are interested in dumpster-dive are comparing it to the libraries listed below.
- a pretty-committed wikipedia markup parser ☆784 Updated this week
- text mining utilities for Node.js ☆142 Updated last year
- 🎀 JavaScript API for spaCy with Python REST API ☆195 Updated last year
- ⚙️ [Processor] A better English POS tagger written in JavaScript ☆53 Updated 7 years ago
- Expose Spacy nlp text parsing to Nodejs (and other languages) via socketIO ☆225 Updated 2 years ago
- command-line tool to extract taxonomies from Wikidata ☆125 Updated 5 years ago
- WordNet in JSON format. ☆91 Updated 4 years ago
- LDA topic modeling for node.js ☆293 Updated 4 months ago
- varied english texts for modern NLP testing ☆75 Updated 2 years ago
- FastText for Node.js ☆196 Updated last year
- Filter and format a newline-delimited JSON stream of Wikibase entities ☆98 Updated 3 months ago
- Sentence Boundary Detection in javascript for node. http://tessmore.github.io/sbd/ ☆208 Updated last year
- NLP Functions for amplifying negations, managing elisions, creating ngrams, stems, phonetic codes to tokens and more. ☆124 Updated 10 months ago
- spaCy REST API, wrapped in a Docker container. ☆266 Updated 2 years ago
- Scripts and microservice to feed an ElasticSearch with Wikidata and Inventaire entities, and keep those up-to-date ☆41 Updated 4 years ago
- plugin to extract keywords and key-phrases ☆332 Updated 2 months ago
- Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on Timbl,… ☆75 Updated last month
- A module for node.js and the browser that takes in text and strips it of stopwords ☆239 Updated 2 weeks ago
- English NLP for Node.js and the browser. ☆87 Updated last year
- JS utils functions to query a Wikibase instance and simplify its results ☆327 Updated 3 months ago
- displaCy.js: An open-source NLP visualiser for the modern web ☆343 Updated 6 years ago
- CLDR text segmentation for JavaScript ☆38 Updated 8 months ago
- Visualize Wikidata items using d3.js ☆192 Updated 5 months ago
- WordNet Database files (previously WNdb) ☆215 Updated 5 years ago
- Index Common Crawl archives in tabular format ☆109 Updated 2 months ago
- an opinionated assembly of wordnet for javascript ☆56 Updated 7 years ago
- DBpedia Spotlight is a tool for automatically annotating mentions of DBpedia resources in text. Improving Efficiency and Accuracy in Mult… ☆179 Updated last year
- Node module wrapper for WordNet dictionary. ☆50 Updated 2 years ago
- List of emoji rated for valence ☆123 Updated 2 years ago