mit-ccc / RadioTalk
The RadioTalk dataset of talk radio transcripts
☆59 · Updated 4 years ago
Alternatives and similar repositories for RadioTalk:
Users who are interested in RadioTalk are comparing it to the repositories listed below:
- New York Times Word Innovation Types dataset ☆21 · Updated 4 years ago
- Experiments to help discussion on Wikipedia talk pages ☆66 · Updated 5 months ago
- finite-state toolkit, EM and Bayesian (Gibbs sampling) training for FST and context-free derivation forests ☆41 · Updated 2 years ago
- Featurize words into orthographic and phonological vectors. ☆40 · Updated last year
- Collaborative web framework for analyzing text (e.g., tweets). Supports standard labeling and pairwise comparison. ☆14 · Updated 3 years ago
- R tools to download, ingest, and analyze the Phoenix dataset from the Open Event Data Alliance ☆12 · Updated 8 years ago
- A pipeline for detecting novel information about entities from a stream of text, updating a knowledge base about the entities, and genera… ☆32 · Updated 5 years ago
- Code for my blog post on Generating Words from Embeddings ☆23 · Updated 9 months ago
- A Python package for audio annotation and classifier training. Developed in collaboration with the WGBH Foundation and the American Archi… ☆17 · Updated 6 years ago
- Use spaCy for NLP and output to the FoLiA XML format. ☆12 · Updated last year
- bin files ☆13 · Updated 3 months ago
- automatically align transcribed audio and generate a wav2letter training corpus ☆36 · Updated 2 years ago
- An API to access data from The New Yorker Caption Contest ☆61 · Updated 2 years ago
- Repository for code and metadata to support work described in "Authorless Topic Models: Biasing Models Away from Known Structure" ☆28 · Updated 4 years ago
- English web corpus with 4M tokens and several annotation types ☆26 · Updated last year
- Demonstration of the results in "Text Normalization using Memory Augmented Neural Networks", Authors: Subhojeet Pramanik, Aman Hussain ☆60 · Updated 5 years ago
- Easy to use ML model for spelling and sounding out words ☆92 · Updated 9 months ago
- TopicScan: Visualization and validation interface for NMF Topic Modeling ☆23 · Updated 4 years ago
- Materials to reproduce findings in our story, "Google’s Top Search Result? Surprise! It’s Google" ☆34 · Updated 4 years ago
- A repository of materials for a proposed class on automated story bots. ☆49 · Updated 6 years ago
- Visual analytics application for qualitative text analysis ☆24 · Updated 2 years ago
- A simple interface to the Project Gutenberg corpus. ☆17 · Updated 9 years ago
- A guide to building language technology in new languages. ☆58 · Updated 3 years ago
- Data and experiments with world population densities for comparison to addresses ☆12 · Updated 8 years ago
- Code and data for Koenecke et al. (2020) ☆28 · Updated 2 years ago
- A real-time document recommendation system for speech streams ☆19 · Updated 6 years ago
- Analysis of gutenberg dataset ☆44 · Updated 6 years ago