spencermountain / dumpster-dive
roll a wikipedia dump into mongo
☆251 Updated 3 weeks ago
Alternatives and similar repositories for dumpster-dive
Users who are interested in dumpster-dive are comparing it to the libraries listed below.
- a pretty-committed wikipedia markup parser ☆852 Updated 2 months ago
- 🎀 JavaScript API for spaCy with Python REST API ☆200 Updated 2 years ago
- Expose Spacy nlp text parsing to Nodejs (and other languages) via socketIO ☆228 Updated 3 years ago
- English NLP for Node.js and the browser. ☆87 Updated 2 years ago
- varied english texts for modern NLP testing ☆77 Updated 3 years ago
- FastText for Node.js ☆199 Updated 2 years ago
- ⚙️ [Processor] A better English POS tagger written in JavaScript ☆56 Updated 8 years ago
- WordNet in JSON format. ☆97 Updated 5 years ago
- command-line tool to extract taxonomies from Wikidata ☆129 Updated 6 years ago
- JS utils functions to query a Wikibase instance and simplify its results ☆342 Updated this week
- text mining utilities for Node.js ☆142 Updated 3 years ago
- spaCy REST API, wrapped in a Docker container. ☆268 Updated 3 years ago
- This project represents the 300-dimensional word vectors from word2vec as JSON. ☆129 Updated 9 years ago
- tools for working with Princeton's lexical database WordNet ☆74 Updated 7 years ago
- Markov Chain combined with word vector embedding (word2vec) and part-of-speech tagging, for context-aware text generation. License: MIT ☆99 Updated 8 years ago
- Filter and format a newline-delimited JSON stream of Wikibase entities ☆105 Updated 5 months ago
- Word embeddings for the web ☆28 Updated 3 years ago
- displaCy.js: An open-source NLP visualiser for the modern web ☆345 Updated 7 years ago
- TextRank algorithm implementation in Javascript ☆40 Updated 10 years ago
- English lexicon useful in NLP/NLU ☆17 Updated 2 years ago
- Creates a Neo4j graph of Wikipedia links. ☆258 Updated 7 years ago
- Multilingual tokenizer that automatically tags each token with its type ☆65 Updated 2 years ago
- Text summarization using Lexrank ☆54 Updated 7 years ago
- Json Wikipedia, contains code to convert the Wikipedia xml dump into a json/avro dump ☆255 Updated 2 years ago
- A modular annotation system that supports complex, interactive annotation graphs embedded on top of sequences of text. ☆98 Updated 4 years ago
- A client for the Stanford Part of Speech Tagger XMLRPC server. ☆72 Updated 8 years ago
- RosaeNLG is a Natural Language Generation library for node.js and browser rendering, based on the Pug template engine. ☆106 Updated last year
- an opinionated assembly of wordnet for javascript ☆56 Updated 8 years ago
- A Wordnet API in pure JavaScript ☆110 Updated 3 years ago
- LanguageCrunch NLP server docker image ☆285 Updated 3 years ago