korenyoni / opus-api
OPUS (opus.nlpl.eu) Python3 API
☆14 Updated last week
Related projects
Alternatives and complementary repositories for opus-api
- Finite-state toolkit, EM and Bayesian (Gibbs sampling) training for FST and context-free derivation forests ☆41 Updated 2 years ago
- Finite-state script normalization and processing utilities ☆38 Updated this week
- Code and data for Koenecke et al. (2020) ☆28 Updated last year
- Utilities for manipulating finite state transducers with the OpenFst library. ☆30 Updated 7 years ago
- An Interactive Tool for Annotating Discourse Structure and Text Improvement ☆16 Updated 3 years ago
- Dataset used to analyze user preferences of podcast summaries ☆8 Updated 2 years ago
- NEAL (Nature+Energy Audio Labeller) is an open-source interactive audio data annotation tool. ☆13 Updated 2 months ago
- Labeled data for homograph disambiguation ☆53 Updated last year
- The EMU-webApp is an online and offline web application for labeling, visualizing and correcting speech and derived speech data. ☆51 Updated 2 months ago
- From a large speech audio file and its corresponding body of text, automatically chunk the audio and text into (phrase, audio_snippet) pa… ☆17 Updated 9 years ago
- A tiny BERT for low-resource monolingual models ☆29 Updated last month
- Scrapes some Finnish word definitions from English Wiktionary. ☆7 Updated last year
- ipapy is a Python module to work with International Phonetic Alphabet (IPA) strings ☆81 Updated 6 months ago
- American English Pronunciation Dictionary ☆34 Updated 6 years ago
- Simple, standalone python classes for training statistical language models using several popular smoothing methods. ☆25 Updated 12 years ago
- A real-time document recommendation system for speech streams ☆19 Updated 6 years ago
- A JAX library for building lattice-based speech transducer models ☆40 Updated 3 weeks ago
- Language data store and linguistic query API ☆39 Updated last month
- The Seshat audio annotation management platform ☆13 Updated 3 years ago
- A toolkit for producing n-gram language models. The highlights are the implementation of Kneser-Ney growing and revised Kneser pruning me… ☆40 Updated 2 months ago
- Thot toolkit for statistical machine translation ☆50 Updated 2 years ago
- This repository contains the DFKI Product Corpus, a dataset of 174 documents annotated for product and company named entities, and the re… ☆12 Updated 2 months ago
- Microsoft Speech Language Translation (MSLT) Corpus ☆19 Updated 7 years ago
- Featurize words into orthographic and phonological vectors. ☆40 Updated last year
- Expected edit distance implementation using OpenFst tools ☆11 Updated 9 years ago
- Development repository for Integrated Speech Corpus Analysis (ISCAN) ☆9 Updated 2 years ago
- An alternative approach for probabilistic topic modeling based on agglomerative clustering of topics (not documents) ☆12 Updated 3 years ago
- ☆22 Updated 2 years ago
- Lightweight utility tools for the detection of multiple spellings, meanings, and language-specific terminology in British and American En… ☆15 Updated 3 years ago
- 🐍🍑 Python 3 library for managing, annotating, and converting natural language corpuses using popular formats (CoNLL, ELAN, Praat, CSV, … ☆18 Updated 4 months ago