mattbierner / urban-dictionary-entry-collector
Script used to collect entry data from Urban Dictionary
☆33Updated 8 years ago
Related projects:
- ☆95Updated 3 years ago
- WordNet in JSON format.☆90Updated 4 years ago
- Wiktionary parser tool for many language editions.☆53Updated 2 years ago
- Tools for bulk indexing of WARC/ARC files on Hadoop, EMR or local file system.☆41Updated 6 years ago
- Python bindings to the Dutch NLP tool Frog (POS tagger, lemmatiser, NER tagger, morphological analysis, shallow parser, dependency parser…☆47Updated last week
- An Apache Spark framework for easy data processing, extraction as well as derivation for web archives and archival collections, developed…☆143Updated last week
- My implementation of Explicit Semantic Analysis (ESA) library that we used at KMi, Open University to produce our submission at the NTCIR…☆36Updated 8 years ago
- A tool for calculating semantic similarity between words from a text corpus based on lexico-syntactic patterns.☆28Updated 8 years ago
- Machine translation for the real world☆23Updated 4 years ago
- Shell scripts to assist downloading & processing the Google n-grams corpora☆14Updated 7 years ago
- High-coverage and high-precision lexica of terms annotated with emotion scores for English and Italian.☆152Updated 5 years ago
- WARC and ARC indexing and discovery tools.☆114Updated last month
- An intelligent reading agent that understands text and translates it into Wikidata statements.☆112Updated 8 years ago
- Socially-Equitable Language Identification☆78Updated last year
- Normalizes lexically ill-formed text to its most likely clean text, e.g. "c u thr 2nite!" -> "see you there tonight!".☆63Updated 8 years ago
- Open-source tools for morphological tagging, segmentation and stemming.☆41Updated 5 years ago
- AMALGrAM, an English supersense tagger written in Python☆33Updated 7 years ago
- http://www.ark.cs.cmu.edu/ARKref/☆32Updated 10 years ago
- Performs multi-document summarization. Includes a method to generate summaries: the method uses a sentence importance score calculator ba…☆37Updated 11 years ago
- Framework for creating and accessing UBY resources – sense-linked lexical resources in standard UBY-LMF format☆22Updated 6 years ago
- *Deprecated* A fast and accurate part-of-speech tagger for TextBlob.☆104Updated 8 years ago
- The SRL-based Open IE extractor. A principal component of Open IE 4.0.☆19Updated 6 years ago
- WordNet-LMF formats☆20Updated last week
- Thot toolkit for statistical machine translation☆50Updated last year
- Polyglot is a language identifier for detecting text documents containing text written in more than one language, and for identifying the…☆32Updated 8 years ago
- Regex-like pattern tree matching, but on a sentence's tree instead of strings☆42Updated 6 years ago
- Extract a plain text corpus from MediaWiki XML dumps, such as Wikipedia.☆132Updated 5 years ago
- A Utility Library for Wikipedia dumps☆33Updated 7 years ago
- Index URLs in Common Crawl☆192Updated 7 years ago
- Maps clauses from a text corpus onto the metrical structure of a poem☆17Updated 9 years ago