LBeaudoux / tatoebatools
A library for fetching and reading Tatoeba's weekly exports
☆24 Updated last year
Alternatives and similar repositories for tatoebatools
Users who are interested in tatoebatools are comparing it to the libraries listed below.
- Python package for WikiMedia dump processing (Wiktionary, Wikipedia etc). Wikitext parsing, template expansion, Lua module execution. Fo… ☆104 Updated last month
- A Python library to conjugate verbs in French, English, Spanish, Italian, Portuguese and Romanian (more soon) using Machine Learning tech… ☆73 Updated 7 months ago
- A Python Wiktionary Parser ☆361 Updated 4 months ago
- A list of ~100,000 German nouns and their grammatical properties compiled from WiktionaryDE as CSV file. Plus a module to look up the dat… ☆153 Updated 6 months ago
- Complete Conjugation of any Verb(e) in Catalan, French, Italian, Portuguese, Romanian or Spanish and conjugate unknown verbs using Machin… ☆90 Updated last year
- Anki add-on to look up vocabulary using Wiktionary ☆19 Updated 4 months ago
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings. ☆52 Updated 4 years ago
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency ☆167 Updated last month
- Open morphology for Finnish ☆91 Updated 2 months ago
- Wiktionary parser tool for many language editions. ☆54 Updated 2 years ago
- Most common sentences and words for all languages in the OpenSubtitles2018 corpus with Python code ☆37 Updated 5 months ago
- A versioned python wrapper package for cmudict (https://github.com/cmusphinx/cmudict). ☆63 Updated 2 weeks ago
- Offline bilingual dictionaries made using data from Wiktionary ☆56 Updated 10 years ago
- A list of vocabulary lists ☆21 Updated 5 years ago
- Wiktionary dump file parser and multilingual data extractor ☆950 Updated this week
- The Language Learning Toolkit (LLTK) performs a variety of tasks useful for (human) language learning. ☆41 Updated 5 years ago
- Listening-based language learning ☆63 Updated last year
- A modern, interlingual wordnet interface for Python ☆254 Updated last week
- An LL parser for extracting information from Wiki text, particularly Wiktionary. ☆49 Updated last year
- A Python library to conjugate verbs in French, English, Spanish, Italian, Portuguese and Romanian (more soon) using Machine Learning tech… ☆75 Updated 3 weeks ago
- A MorphMan fork rebuilt from the ground up with a focus on simplicity, performance, and a codebase with minimal technical debt. ☆89 Updated this week
- a script and anki addon to turn KanjiVG data into colored stroke order diagrams ☆127 Updated last year
- Sources of Collatinus software - Latin lemmatizer, morphological analyzer and scansion ☆76 Updated 3 months ago
- A Python library to parse MediaWiki WikiText ☆311 Updated 2 months ago
- Extract data from German Wiktionary XML files. ☆26 Updated 6 months ago
- eXtensible Interlinear Glossed Text ☆33 Updated 3 years ago
- Creates interlinearized versions of books (EPUB, MOBI, etc), adding "subtitles" with translations under each word in the text. ☆24 Updated 4 years ago
- A jisho.org API made in Python ☆83 Updated 4 months ago
- SegBo: A database of borrowed sounds in the world’s languages ☆16 Updated last year
- Code to create a database with cleaned up Wiktionary data and then to create ebook dictionaries based on this data. ☆25 Updated last year