mit-ccc / RadioTalk
The RadioTalk dataset of talk radio transcripts
☆60Updated 4 years ago
Alternatives and similar repositories for RadioTalk
Users who are interested in RadioTalk are comparing it to the libraries listed below.
- An API to access data from The New Yorker Caption Contest☆62Updated 2 years ago
- Experiments to help discussion on Wikipedia talk pages☆67Updated this week
- finite-state toolkit, EM and Bayesian (Gibbs sampling) training for FST and context-free derivation forests☆41Updated 2 years ago
- Featurize words into orthographic and phonological vectors.☆41Updated 2 years ago
- New York Times Word Innovation Types dataset☆21Updated 4 years ago
- Demonstration of the results in "Text Normalization using Memory Augmented Neural Networks", Authors: Subhojeet Pramanik, Aman Hussain☆60Updated 6 years ago
- PoKi: A Large Dataset of Poems by Children☆36Updated 6 months ago
- Markdown template for Datasheets for Datasets☆63Updated 3 years ago
- Automatic Measurement of Vowel Duration for Consonant Vowel Consonant (CVC) sound files (JASA 2016)☆14Updated 8 years ago
- Analysis of the Gutenberg dataset☆45Updated 6 years ago
- TopicScan: Visualization and validation interface for NMF Topic Modeling☆23Updated 5 years ago
- An example of how to use spaCy for extremely large files without running into memory issues☆36Updated 3 years ago
- Prosodic: a metrical-phonological parser, written in Python. For English and Finnish, with flexible language support.☆286Updated 6 months ago
- ☆74Updated this week
- Gamma Agreement in Python☆45Updated last year
- A Python package to facilitate research on building and evaluating automated scoring models.☆70Updated 8 months ago
- An implementation of latent Dirichlet allocation in JavaScript☆185Updated 3 years ago
- Code and data for Koenecke et al. (2020)☆30Updated 2 years ago
- ADS Project☆14Updated 9 years ago
- A Python interface to OpenFst☆88Updated 6 years ago
- A simple interface to the Project Gutenberg corpus.☆17Updated 9 years ago
- Open Source AI Benchmarking toolkit for benchmarking speech to text services☆57Updated last year
- Links to data used in Sproat & Jaitly (https://arxiv.org/abs/1611.00068) experiments.☆76Updated 4 years ago
- ☆22Updated 3 years ago
- Compare coverage across different media sources using the Juicer☆12Updated 9 years ago
- Punctuation generation for speech transcripts using lexical and prosodic features☆41Updated 6 years ago
- Forced Alignments for Common Voice☆31Updated 4 years ago
- Multi-lingual Text Processing☆96Updated 6 years ago
- Matrix tools for building and inspecting latent spaces☆27Updated 7 years ago
- ☆34Updated 3 years ago