LBeaudoux / tatoebatools
A library for fetching and reading Tatoeba's weekly exports
☆20 Updated 11 months ago
Related projects
Alternatives and complementary repositories for tatoebatools
- A versioned Python wrapper package for cmudict (https://github.com/cmusphinx/cmudict). ☆62 Updated 2 months ago
- Python package for WikiMedia dump processing (Wiktionary, Wikipedia etc). Wikitext parsing, template expansion, Lua module execution. Fo… ☆94 Updated this week
- Code to create a database with cleaned up Wiktionary data and then to create ebook dictionaries based on this data. ☆18 Updated last year
- The Language Learning Toolkit (LLTK) performs a variety of tasks useful for (human) language learning. ☆41 Updated 5 years ago
- Simple word to frequency mappings for the German language based on text corpora and using CISTEM stemmer. ☆11 Updated 3 years ago
- A Python toolkit converting pronunciation in enwiktionary xml dump to cmudict format ☆33 Updated 5 years ago
- Creates interlinearized versions of books (EPUB, MOBI, etc), adding "subtitles" with translations under each word in the text. ☆22 Updated 4 years ago
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings. ☆52 Updated 3 years ago
- Generation of bilingual dictionaries from Wiktionary/dbnary data for the WikDict project ☆44 Updated 3 weeks ago
- The World Atlas of Language Structures ☆55 Updated last month
- Offline bilingual dictionaries made using data from Wiktionary ☆52 Updated 9 years ago
- Tab-delimited word frequency list compiled from the German Wikipedia ☆21 Updated 3 years ago
- Wiktionary parser tool for many language editions. ☆53 Updated 2 years ago
- ☆67 Updated 3 months ago
- The Wikinflection Corpus, from the paper "Wikinflection Corpus: A (Better) Multilingual, Morpheme-Annotated Inflectional Corpus" (Metheni… ☆11 Updated 11 months ago
- A list of vocabulary lists ☆21 Updated 4 years ago
- A Python library to conjugate verbs in French, English, Spanish, Italian, Portuguese and Romanian (more soon) using Machine Learning tech… ☆70 Updated this week
- Extract data from German Wiktionary XML files. ☆24 Updated this week
- SegBo: A database of borrowed sounds in the world’s languages ☆16 Updated 8 months ago
- CogNet: a large-scale, high-quality cognate database for 338 languages, 1.07M words, and 8.1 million cognates ☆43 Updated last year
- Verb forms dictionary ☆60 Updated 7 years ago
- Python interface to ISLEX, an English IPA pronunciation dictionary with syllable and stress marking. ☆47 Updated 11 months ago
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency ☆145 Updated this week
- Interactive visualization of Wiktionary words and etymologies. ☆90 Updated this week
- Open source, updated Whitaker's Words Latin Dictionary and Morphology in Python ☆52 Updated 7 years ago
- The dictionary comprised of the Coptic lexicon created by the BBAW and interface by Coptic SCRIPTORIUM. Currently deployed at https://co… ☆28 Updated 3 months ago
- American English Pronunciation Dictionary ☆34 Updated 6 years ago
- Hanzipy is a Chinese character and NLP module for Chinese language processing for Python. It is primarily written to help provide a frame… ☆16 Updated 10 months ago
- Listening-based language learning ☆53 Updated last year
- Prosodic: a metrical-phonological parser, written in Python. For English and Finnish, with flexible language support. ☆277 Updated 2 weeks ago