gkunter / coquery
Coquery is a free corpus query tool for linguists, lexicographers, translators, and anybody who wishes to search and analyse a text corpus.
☆18 Updated 2 years ago
Related projects:
- A Corpus Data Retrieval Index using Lucene for Look-Ups ☆16 Updated last week
- linguistics backend ☆40 Updated last year
- Wiktionary parser tool for many language editions. ☆53 Updated 2 years ago
- An advanced, extensible web front-end for the Manatee-open corpus search engine ☆60 Updated this week
- Multi Tier Annotation Search ☆26 Updated 3 years ago
- LingPy: Python library for quantitative tasks in historical linguistics ☆122 Updated 9 months ago
- eXtensible Interlinear Glossed Text ☆31 Updated 2 years ago
- The Wikinflection Corpus, from the paper "Wikinflection Corpus: A (Better) Multilingual, Morpheme-Annotated Inflectional Corpus" (Metheni… ☆10 Updated 9 months ago
- Linguistic search for large annotated text corpora, based on Apache Lucene ☆103 Updated this week
- Lexicons for the Multilingual UCREL Semantic Analysis System ☆38 Updated last year
- AUTOTYP data export ☆38 Updated last year
- The curation repository for the data behind Concepticon. ☆32 Updated this week
- [LREC 2020] EtymDB, an Etymological DataBase (v2.1) ☆21 Updated 2 years ago
- A part-of-speech tagger with support for domain adaptation and external resources. ☆22 Updated last year
- ANNIS is an open source, versatile web browser-based search and visualization architecture for complex multilevel linguistic corpora with… ☆68 Updated 3 weeks ago
- Multi Tier Annotation Search ☆12 Updated 4 months ago
- FoLiA Linguistic Annotation Tool -- Flat is a web-based linguistic annotation environment based around the FoLiA format (http://proycon.g… ☆110 Updated 2 months ago
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings. ☆44 Updated 3 years ago
- A Python module for interfacing with the Treetagger by Helmut Schmid. ☆76 Updated 3 years ago
- A cloud-based, open-source system for writing and publishing dictionaries. ☆85 Updated 8 months ago
- Python wrapper for the CWB to extract concordances and score frequency lists ☆19 Updated last month
- CogNet: a large-scale, high-quality cognate database for 338 languages, 1.07M words, and 8.1 million cognates ☆42 Updated last year
- Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation, and splits sentences. It offers several other basic pr… ☆65 Updated last week
- Code for the paper: Wikinflection: Massive semi-supervised generation of multilingual inflectional corpus from Wiktionary (Metheniti and … ☆8 Updated 4 years ago
- A list of resources for conservation, development, and documentation of endangered, minority, and low or under-resourced human languages. ☆34 Updated last year
- Python framework for processing Universal Dependencies data ☆55 Updated last week
- Linguistica 5: Unsupervised Learning of Linguistic Structure ☆30 Updated 5 years ago
- Python for Linguists – a Gentle Introduction to Programming ☆44 Updated 8 years ago
- The World Atlas of Language Structures ☆51 Updated 2 months ago
- CONLL-U to Pandas DataFrame ☆30 Updated 6 years ago