ngrams-dev / general
NGRAMS is a search engine for the Google Books Ngram Dataset. This repository contains documentation, discussions, announcements, and issues.
☆14 · Updated last year
Related projects
Alternatives and complementary repositories for general
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings. ☆52 · Updated 3 years ago
- an interactive visual tool for exploring ideologies of political parties from up-to-date WikiData, using SPARQL, D3js, and PixiJS ☆15 · Updated 3 years ago
- linguistics tree drawing to SVG in Python, aimed at Jupyter ☆62 · Updated 3 months ago
- an approximate string matching or fuzzy-matching system for spelling correction, normalisation or post-OCR correction ☆31 · Updated last month
- Multi Tier Annotation Search ☆12 · Updated 6 months ago
- Ontologies of Linguistic Annotation. Machine-readable tagsets and annotation schemata for more than 100 languages. ☆20 · Updated last year
- Python Multilingual Ucrel Semantic Analysis System ☆30 · Updated 3 months ago
- The Unicode Cookbook for Linguists ☆53 · Updated 4 years ago
- Lexicons for the Multilingual UCREL Semantic Analysis System ☆39 · Updated last year
- Most common sentences and words for all languages in the OpenSubtitles2018 corpus with Python code ☆23 · Updated 2 years ago
- Word/n-gram frequency lists for the Google Books Ngram Corpus (v3, all languages) with Python code ☆50 · Updated last year
- BlackLab Frontend, a feature-rich corpus search interface for BlackLab. ☆16 · Updated this week
- Multilingual syllable annotation pipeline component for spaCy ☆37 · Updated last year
- Authoring tool for interactive content ☆15 · Updated this week
- Wikidata property explorer ☆15 · Updated 8 months ago
- Tool for the Automatic Analysis of Syntactic Sophistication and Complexity ☆24 · Updated last year
- FoLiA Linguistic Annotation Tool -- Flat is a web-based linguistic annotation environment based around the FoLiA format (http://proycon.g… ☆110 · Updated 4 months ago
- The Data Format for Digital Linguistics (DaFoDiL) ☆22 · Updated last year
- This repository provides various Python methods for finding and aggregating synonyms for an individual word or a list of words. ☆32 · Updated last year
- Frontend for Korp, a tool using the IMS Open Corpus Workbench (CWB). ☆16 · Updated this week
- Information extraction from English and German texts based on predicate logic ☆135 · Updated last year
- [LREC 2020] EtymDB, an Etymological DataBase (v2.1) ☆21 · Updated 2 years ago
- A sentence segmentation library with wide language support optimized for speed and utility. ☆53 · Updated 2 months ago
- A Corpus Data Retrieval Index using Lucene for Look-Ups ☆16 · Updated this week
- Gutenberg cache and query library ☆36 · Updated 3 months ago
- A simple toolkit for conducting analyses using corpus methods ☆24 · Updated 3 years ago
- A language evolution simulator, using realistic phonetic changes. ☆38 · Updated last year
- FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (inclu… ☆61 · Updated 6 months ago
- Framework for creating and accessing UBY resources – sense-linked lexical resources in standard UBY-LMF format ☆22 · Updated 6 years ago
- This packages up data for the Open Multilingual Wordnet ☆43 · Updated 3 weeks ago