mattbierner / urban-dictionary-entry-collector
Script used to collect entry data from Urban Dictionary
☆33Updated 9 years ago
Alternatives and similar repositories for urban-dictionary-entry-collector
Users who are interested in urban-dictionary-entry-collector are comparing it to the libraries listed below.
- Tools for bulk indexing of WARC/ARC files on Hadoop, EMR or local file system.☆46Updated 7 years ago
- Python scripts for retrieving CSV data from the Google Ngram Viewer and plotting it in XKCD style. The Python script for retrieving ngram…☆254Updated 4 years ago
- WordNet in JSON format.☆91Updated 4 years ago
- ☆97Updated 3 years ago
- Wikipedia API wrapper for humans and elk. (en.wikipedia.org/w/api.php, get it?)☆36Updated 10 years ago
- An Apache Spark framework for easy data processing, extraction as well as derivation for web archives and archival collections, developed…☆150Updated 3 weeks ago
- Socially-Equitable Language Identification☆78Updated 2 years ago
- An intelligent reading agent that understands text and translates it into Wikidata statements.☆116Updated 9 years ago
- Wiktionary parser tool for many language editions.☆54Updated 2 years ago
- Python library for reading and writing warc files☆243Updated 3 years ago
- My implementation of Explicit Semantic Analysis (ESA) library that we used at KMi, Open University to produce our submission at the NTCIR…☆36Updated 9 years ago
- English Dependency Relationship Extractor☆85Updated 6 months ago
- ANNIS is an open source, versatile web browser-based search and visualization architecture for complex multilevel linguistic corpora with…☆75Updated last month
- Tarsqi Toolkit☆26Updated 4 years ago
- Json Wikipedia, contains code to convert the Wikipedia xml dump into a json dump. Questions? https://gitter.im/idio-opensource/Lobby☆17Updated 3 years ago
- Experiments to help discussion on Wikipedia talk pages☆66Updated 3 weeks ago
- M-ATOLL: A Framework for the Lexicalization of Ontologies in Multiple Languages☆10Updated 8 years ago
- Scripts and microservice to feed an ElasticSearch with Wikidata and Inventaire entities, and keep those up-to-date☆41Updated 4 years ago
- An open source toolkit for mining Wikipedia☆129Updated 6 years ago
- *Deprecated* A fast and accurate part-of-speech tagger for TextBlob.☆102Updated 9 years ago
- Deployment of pywb as a CommonCrawl Index Server☆21Updated 7 years ago
- Deutsch Language Tool Kit☆12Updated 9 years ago
- a collection of functions that measure the readability of a given body of text☆195Updated 7 years ago
- Bilingual sentence aligner (Gale & Church, 1993)☆14Updated 6 years ago
- DKPro C4CorpusTools is a collection of tools for processing CommonCrawl corpus, including Creative Commons license detection, boilerplate…☆52Updated 5 years ago
- Multi Tier Annotation Search☆26Updated 4 years ago
- ☆32Updated 4 years ago
- Natural language processing pipeline for book-length documents (archival Java version; for current Python version, see: https://github.co…☆315Updated 3 years ago
- Semanticizest: dump parser and client☆20Updated 9 years ago
- A Java UIMA-based toolbox for multilingual and efficient terminology extraction and multilingual term alignment☆40Updated 7 years ago