LBeaudoux / tatoebatools
A library for fetching and reading Tatoeba's weekly exports
☆23 Updated last year
Alternatives and similar repositories for tatoebatools
Users who are interested in tatoebatools are comparing it to the libraries listed below.
- Python package for WikiMedia dump processing (Wiktionary, Wikipedia etc). Wikitext parsing, template expansion, Lua module execution. Fo… ☆101 Updated 2 weeks ago
- A versioned python wrapper package for cmudict (https://github.com/cmusphinx/cmudict). ☆63 Updated last month
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings. ☆51 Updated 3 years ago
- Offline bilingual dictionaries made using data from Wiktionary ☆55 Updated 10 years ago
- Generation of bilingual dictionaries from Wiktionary/dbnary data for the WikDict project ☆49 Updated 7 months ago
- The World Atlas of Language Structures ☆61 Updated 7 months ago
- The Language Learning Toolkit (LLTK) performs a variety of tasks useful for (human) language learning. ☆41 Updated 5 years ago
- Interactive visualization of Wiktionary words and etymologies. ☆92 Updated 3 months ago
- Wiktionary parser tool for many language editions. ☆54 Updated 2 years ago
- Code to create a database with cleaned up Wiktionary data and then to create ebook dictionaries based on this data. ☆25 Updated last year
- Python interface to ISLEX, an English IPA pronunciation dictionary with syllable and stress marking. ☆52 Updated last year
- A list of resources for conservation, development, and documentation of endangered, minority, and low or under-resourced human languages. ☆34 Updated 2 years ago
- A Python toolkit converting pronunciation in enwiktionary xml dump to cmudict format ☆33 Updated 5 years ago
- The Open Multilingual Wordnet ☆61 Updated last year
- The Global WordNet Association Collaborative Inter-Lingual Index ☆42 Updated 7 months ago
- Tools for scraping, annotating, and parsing morphological information from Wiktionary ☆14 Updated 5 years ago
- A Python library to conjugate verbs in French, English, Spanish, Italian, Portuguese and Romanian (more soon) using Machine Learning tech… ☆72 Updated 6 months ago
- Python API to access glottolog/glottolog ☆29 Updated last week
- Master repo for the UniMorph project, includes the UniMorph schema and annotated data files ☆30 Updated 5 years ago
- Lexical data at Unicode ☆68 Updated 9 months ago
- Massively multilingual pronunciation mining ☆341 Updated 2 weeks ago
- Domain-specific programming language for linguistic grammars and transducers ☆14 Updated last week
- Collaborative data curation for Glottolog ☆164 Updated this week
- Frontend for Korp, a tool using the IMS Open Corpus Workbench (CWB). ☆16 Updated last week
- Analyze and manipulate your Anki flashcards using pandas dataframes! ☆140 Updated this week
- Listening-based language learning ☆63 Updated last year
- Wikidata lexemes presentations ☆23 Updated 2 months ago
- The Unicode Cookbook for Linguists ☆54 Updated 4 years ago
- An advanced, extensible web front-end for the Manatee-open corpus search engine ☆65 Updated last week
- Creates interlinearized versions of books (EPUB, MOBI, etc), adding "subtitles" with translations under each word in the text. ☆24 Updated 4 years ago