spencermountain / dumpster-dive
roll a wikipedia dump into mongo
☆245 Updated last year
Alternatives and similar repositories for dumpster-dive
Users who are interested in dumpster-dive are comparing it to the libraries listed below.
- a pretty-committed wikipedia markup parser ☆824 Updated last month
- 🎀 JavaScript API for spaCy with Python REST API ☆196 Updated last year
- ⚙️ [Processor] A better English POS tagger written in JavaScript ☆55 Updated 8 years ago
- Expose Spacy nlp text parsing to Nodejs (and other languages) via socketIO ☆226 Updated 2 years ago
- FastText for Node.js ☆196 Updated 2 years ago
- WordNet in JSON format. ☆91 Updated 5 years ago
- Multilingual tokenizer that automatically tags each token with its type ☆62 Updated 2 years ago
- varied english texts for modern NLP testing ☆76 Updated 3 years ago
- JS utils functions to query a Wikibase instance and simplify its results ☆335 Updated last week
- spaCy REST API, wrapped in a Docker container. ☆267 Updated 2 years ago
- command-line tool to extract taxonomies from Wikidata ☆128 Updated 6 years ago
- Filter and format a newline-delimited JSON stream of Wikibase entities ☆98 Updated 2 months ago
- NLP Functions for amplifying negations, managing elisions, creating ngrams, stems, phonetic codes to tokens and more. ☆131 Updated last year
- Creates a Neo4j graph of Wikipedia links. ☆256 Updated 7 years ago
- ☆13 Updated 8 years ago
- One trick pony NLP library for extracting keywords from HTML documents ☆18 Updated 9 years ago
- displaCy.js: An open-source NLP visualiser for the modern web ☆345 Updated 7 years ago
- Word embeddings for the web ☆28 Updated 2 years ago
- an opinionated assembly of wordnet for javascript ☆56 Updated 8 years ago
- Scripts and microservice to feed an ElasticSearch with Wikidata and Inventaire entities, and keep those up-to-date ☆41 Updated 4 years ago
- A lightweight JavaScript client library for the Wikimedia Pageviews API for Wikipedia and various of its sister projects for Node.js and … ☆27 Updated 4 years ago
- LDA-Based Topic Modelling in Javascript ☆44 Updated 11 years ago
- A command-line tool for using CommonCrawl Index API at http://index.commoncrawl.org/ ☆196 Updated 6 years ago
- TextRank algorithm implementation in Javascript ☆41 Updated 10 years ago
- Json Wikipedia, contains code to convert the Wikipedia xml dump into a json/avro dump ☆254 Updated last year
- AmbiverseNLU: A Natural Language Understanding suite by Max Planck Institute for Informatics ☆212 Updated last year
- This project represents the 300-dimensional word vectors from word2vec as JSON. ☆127 Updated 8 years ago
- A modular annotation system that supports complex, interactive annotation graphs embedded on top of sequences of text. ☆96 Updated 3 years ago
- A client for the Stanford Part of Speech Tagger XMLRPC server. ☆72 Updated 8 years ago
- displaCy-ent.js: An open-source named entity visualiser for the modern web ☆198 Updated 7 years ago