remusao / wgraph
Etymological graphs based on Wiktionary dumps
☆18 · Updated last year
Related projects
Alternatives and complementary repositories for wgraph
- English Lemma Database - Compiled by Referencing British National Corpus ☆29 · Updated last month
- This repository contains code behind the visualization of the Wikimedia tool etytree at http://tools.wmflabs.org/etytree/ ☆50 · Updated 5 years ago
- Offline etymological dictionary based on Wiktionary data ☆20 · Updated 2 years ago
- WordNet in JSON format. ☆90 · Updated 4 years ago
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings. ☆51 · Updated 3 years ago
- Python package for WikiMedia dump processing (Wiktionary, Wikipedia etc). Wikitext parsing, template expansion, Lua module execution. Fo… ☆94 · Updated this week
- Interactive visualization of Wiktionary words and etymologies. ☆90 · Updated last week
- [LREC 2020] EtymDB, an Etymological DataBase (v2.1) ☆21 · Updated 2 years ago
- Word/n-gram frequency lists for the Google Books Ngram Corpus (v3, all languages) with Python code ☆49 · Updated last year
- A library for fetching and reading Tatoeba's weekly exports ☆20 · Updated 11 months ago
- CLDR text segmentation for JavaScript ☆38 · Updated 6 months ago
- CogNet: a large-scale, high-quality cognate database for 338 languages, 1.07M words, and 8.1 million cognates ☆43 · Updated last year
- Wiktionary parser tool for many language editions. ☆53 · Updated 2 years ago
- A Node Based Wiktionary Parser ☆15 · Updated 6 years ago
- Gather modern English word frequencies from all enwiki articles. ☆202 · Updated 8 months ago
- Offline bilingual dictionaries made using data from Wiktionary ☆52 · Updated 9 years ago
- All the words from Google Books, sorted by frequency ☆109 · Updated last year
- An open etymology dataset created using Wiktionary data. Contains 3.8M entries, 1.8M terms, 2900 languages, and 31 unique relationship ty… ☆75 · Updated 5 months ago
- A list of vocabulary lists ☆21 · Updated 4 years ago
- The Open English WordNet ☆473 · Updated this week
- Helsinki Finite-State Technology (library and application suite) ☆122 · Updated 3 weeks ago
- Generation of bilingual dictionaries from Wiktionary/dbnary data for the WikDict project ☆43 · Updated last week
- Prosodic: a metrical-phonological parser, written in Python. For English and Finnish, with flexible language support. ☆276 · Updated this week
- Open Language Profiles — English profile datasets from CEFR-J ☆99 · Updated 4 years ago
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency ☆143 · Updated this week
- English lemmatizer ☆65 · Updated last year
- Creates interlinearized versions of books (EPUB, MOBI, etc), adding "subtitles" with translations under each word in the text. ☆22 · Updated 4 years ago
- The Global WordNet Association Collaborative Inter-Lingual Index ☆40 · Updated this week
- Crawler for linguistic corpora ☆192 · Updated 11 months ago
- Wiktionary in accessible JSON format ☆34 · Updated last year