stressosaurus / raw-data-google-ngram
This will download and process the Google Ngram data.
☆16 · Updated 2 years ago
Alternatives and similar repositories for raw-data-google-ngram
Users who are interested in raw-data-google-ngram are comparing it to the libraries listed below.
- Scripts and tools for doing unsupervised acceptability prediction. ☆15 · Updated 2 years ago
- STREUSLE: a corpus with comprehensive lexical semantic annotation (multiword expressions, supersenses) ☆66 · Updated 2 years ago
- ParCourE - Parallel Corpus Explorer ☆12 · Updated 3 years ago
- ☆17 · Updated 4 years ago
- linguistic converter / merging tool for multi-level annotated corpora. graph-based (using Python and NetworkX). ☆51 · Updated 2 years ago
- Repository for rstWeb, a browser based annotation interface for Rhetorical Structure Theory ☆43 · Updated 6 months ago
- An easy-to-use library to linguistically compare one sentence and its words to another, in the same language or a different one. For inst… ☆22 · Updated 3 years ago
- Alignment and annotation for comparable documents. ☆22 · Updated 6 years ago
- Repository for the Georgetown University Multilayer Corpus (GUM) ☆94 · Updated this week
- CONLL-U to Pandas DataFrame ☆31 · Updated 7 years ago
- Distribution of word meanings in Wikipedia for English, Italian, French, German and Spanish. ☆10 · Updated 4 years ago
- The Universal Decompositional Semantics (UDS) dataset and the Decomp toolkit ☆57 · Updated last year
- Python Multilingual Ucrel Semantic Analysis System ☆32 · Updated 8 months ago
- Transform TMX to text ☆28 · Updated 2 years ago
- Format conversion and graphical representation of [Universal Dependencies](http://universaldependencies.org) trees. ☆12 · Updated 8 months ago
- Lexicons for the Multilingual UCREL Semantic Analysis System ☆41 · Updated last year
- MAGPIE: A sense-annotated corpus of potentially idiomatic expressions ☆27 · Updated 4 years ago
- Scripts to evaluate scoped meaning representations ☆19 · Updated 2 years ago
- ☆72 · Updated last month
- The Arborator software is aimed at collaboratively annotating dependency corpora. ☆26 · Updated 5 years ago
- Tool to fix bitexts and tag near-duplicates for removal ☆30 · Updated 3 months ago
- [ACL 2021, Findings] Cognate Prediction Per Machine Translation ☆10 · Updated 2 years ago
- Python version for Doug Biber's Multidimensional Analysis (MDA) ☆30 · Updated 2 months ago
- List of corpora annotated for coreference for different languages ☆17 · Updated 9 months ago
- Robust Cross-lingual Embeddings from Parallel Sentences ☆22 · Updated 4 years ago
- Linguistic and stylistic complexity measures for (literary) texts ☆81 · Updated last year
- Word Sense Induction with BERT MLM ☆28 · Updated last year
- Efficient Low-Memory Aligner ☆143 · Updated 4 months ago
- OpusFilter - Parallel corpus processing toolkit ☆104 · Updated last month
- Bilingual sentence similarity classifier using Tensorflow ☆21 · Updated 5 years ago