LBeaudoux / tatoebatools
A library for fetching and reading Tatoeba's weekly exports
☆24 Updated last year
Alternatives and similar repositories for tatoebatools
Users who are interested in tatoebatools are comparing it to the libraries listed below.
- A Python Wiktionary Parser ☆363 Updated last month
- Python package for WikiMedia dump processing (Wiktionary, Wikipedia etc). Wikitext parsing, template expansion, Lua module execution. Fo… ☆106 Updated this week
- A list of ~100,000 German nouns and their grammatical properties compiled from WiktionaryDE as CSV file. Plus a module to look up the dat… ☆157 Updated 8 months ago
- Wiktionary dump file parser and multilingual data extractor ☆985 Updated this week
- A Python library to conjugate verbs in French, English, Spanish, Italian, Portuguese and Romanian (more soon) using Machine Learning tech… ☆73 Updated 8 months ago
- A versioned python wrapper package for cmudict (https://github.com/cmusphinx/cmudict). ☆64 Updated this week
- A Python library to parse MediaWiki WikiText ☆312 Updated 3 months ago
- Tatoeba is a platform whose purpose is to create a collaborative and open dataset of sentences and their translations. ☆790 Updated this week
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency ☆171 Updated 2 months ago
- A modern, interlingual wordnet interface for Python ☆257 Updated last month
- Massively multilingual pronunciation mining ☆350 Updated last week
- A list of resources for conservation, development, and documentation of endangered, minority, and low or under-resourced human languages. ☆35 Updated 2 years ago
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings. ☆53 Updated 4 years ago
- Open morphology for Finnish ☆92 Updated last week
- SegBo: A database of borrowed sounds in the world’s languages ☆16 Updated last year
- Collaborative data curation for Glottolog ☆170 Updated 3 weeks ago
- a script and anki addon to turn KanjiVG data into colored stroke order diagrams ☆127 Updated last year
- The World Atlas of Language Structures ☆61 Updated 10 months ago
- The Language Learning Toolkit (LLTK) performs a variety of tasks useful for (human) language learning. ☆41 Updated 5 years ago
- Wiktionary parser tool for many language editions. ☆54 Updated 3 years ago
- The dictionary comprised of the Coptic lexicon created by the BBAW and interface by Coptic SCRIPTORIUM. Currently deployed at https://co… ☆30 Updated 7 months ago
- ☆15 Updated last week
- Complete Conjugation of any Verb(e) in Catalan, French, Italian, Portuguese, Romanian or Spanish and conjugate unknown verbs using Machin… ☆92 Updated last year
- Python Finite-State Toolkit ☆58 Updated last week
- About 6,500 Irish lemmas ordered by corpus frequency, with noise removed. ☆35 Updated 7 years ago
- The World Atlas Of Language Structures Online ☆129 Updated 7 months ago
- A Python library for working with and comparing language codes. ☆346 Updated 3 months ago
- Jason Riggle's chart of phonological features in JSON format + extras ☆54 Updated last year
- A Python parser for MediaWiki wikicode ☆823 Updated 2 months ago
- Gather modern English word frequencies from all enwiki articles. ☆222 Updated last year