mit-ccc / RadioTalk
The RadioTalk dataset of talk radio transcripts
☆60Updated 4 years ago
Alternatives and similar repositories for RadioTalk
Users who are interested in RadioTalk are comparing it to the repositories listed below
- Demonstration of the results in "Text Normalization using Memory Augmented Neural Networks", Authors: Subhojeet Pramanik, Aman Hussain☆60Updated 6 years ago
- Source code to accompany my paper "Poetic sound similarity vectors using phonetic features"☆171Updated 7 years ago
- Featurize words into orthographic and phonological vectors.☆41Updated 2 years ago
- An API to access data from The New Yorker Caption Contest☆62Updated 2 years ago
- New York Times Word Innovation Types dataset☆21Updated 4 years ago
- Open Source AI Benchmarking toolkit for benchmarking speech to text services☆56Updated last year
- Catalan ALBERT (A Lite BERT for self-supervised learning of language representations)☆14Updated 5 years ago
- Gamma Agreement in Python☆44Updated last year
- Prosodic: a metrical-phonological parser, written in Python. For English and Finnish, with flexible language support.☆283Updated 4 months ago
- Experiments to help discussion on Wikipedia talk pages☆66Updated 3 weeks ago
- A guide to building language technology in new languages.☆58Updated 3 years ago
- finite-state toolkit, EM and Bayesian (Gibbs sampling) training for FST and context-free derivation forests☆41Updated 2 years ago
- Trained models for automatic speech recognition (ASR). A library to quickly build applications that require speech to text conversion.☆129Updated 4 years ago
- Collaborative web framework for analyzing text (e.g., tweets). Supports standard labeling and pairwise comparison.☆14Updated 3 years ago
- Forced Alignments for Common Voice☆31Updated 4 years ago
- A simple interface to the Project Gutenberg corpus.☆17Updated 9 years ago
- ☆32Updated 4 years ago
- A Python package to facilitate research on building and evaluating automated scoring models.☆68Updated 6 months ago
- ☆22Updated 3 years ago
- Multilingual grapheme-to-phoneme conversion☆20Updated 7 years ago
- Corpus of oral arguments (recorded speech + official transcripts) of the United States Supreme Court☆22Updated 2 years ago
- Markdown template for Datasheets for Datasets☆63Updated 3 years ago
- Bias Tests for Voice Technologies (bt4vt)☆12Updated last year
- Code for my blog post on Generating Words from Embeddings☆23Updated 11 months ago
- Compare coverage across different media sources using the Juicer☆12Updated 9 years ago
- A Python package for audio annotation and classifier training. Developed in collaboration with the WGBH Foundation and the American Archi…☆17Updated 7 years ago
- Code and data used in named entity transliteration experiments☆57Updated 7 years ago
- A pipeline for detecting novel information about entities from a stream of text, updating a knowledge base about the entities, and genera…☆32Updated 5 years ago
- Punctuation generation for speech transcripts using lexical and prosodic features☆41Updated 6 years ago
- A Docker image for the Kaldi speech recognition tool + training data from Pop Up Archive☆20Updated 6 years ago