mit-ccc / RadioTalk
The RadioTalk dataset of talk radio transcripts
☆60 Updated 4 years ago
Alternatives and similar repositories for RadioTalk
Users who are interested in RadioTalk compare it to the repositories listed below.
- New York Times Word Innovation Types dataset ☆21 Updated 4 years ago
- A simple interface to the Project Gutenberg corpus. ☆17 Updated 9 years ago
- Experiments to help discussion on Wikipedia talk pages ☆66 Updated 2 weeks ago
- ☆71 Updated 4 months ago
- Featurize words into orthographic and phonological vectors. ☆41 Updated 2 years ago
- TopicScan: Visualization and validation interface for NMF Topic Modeling ☆23 Updated 4 years ago
- finite-state toolkit, EM and Bayesian (Gibbs sampling) training for FST and context-free derivation forests ☆41 Updated 2 years ago
- Code for my blog post on Generating Words from Embeddings ☆23 Updated 10 months ago
- Python tools for text ☆15 Updated 5 years ago
- Forced Alignments for Common Voice ☆31 Updated 4 years ago
- Visual analytics application for qualitative text analysis ☆24 Updated 2 years ago
- A pipeline for detecting novel information about entities from a stream of text, updating a knowledge base about the entities, and genera… ☆32 Updated 5 years ago
- dataset of podcasts and episodes ☆14 Updated 7 years ago
- Repository for code and metadata to support work described in "Authorless Topic Models: Biasing Models Away from Known Structure" ☆28 Updated 5 years ago
- Code and data for Koenecke et al. (2020) ☆28 Updated 2 years ago
- Collaborative web framework for analyzing text (e.g., tweets). Supports standard labeling and pairwise comparison. ☆14 Updated 3 years ago
- Matrix tools for building and inspecting latent spaces ☆27 Updated 6 years ago
- Toolkit to compile a comparable/parallel corpus from European Parliament proceedings ☆16 Updated 5 years ago
- An API to access data from The New Yorker Caption Contest ☆62 Updated 2 years ago
- Demonstration of the results in "Text Normalization using Memory Augmented Neural Networks", Authors: Subhojeet Pramanik, Aman Hussain ☆60 Updated 5 years ago
- Bias Tests for Voice Technologies (bt4vt) ☆12 Updated 11 months ago
- The repository for the paper "When Do You Need Billions of Words of Pretraining Data?" ☆21 Updated 4 years ago
- How (but not why) to do Twitter sociolinguistic analysis in the Unix Shell ☆10 Updated 9 years ago
- Aggressive reddit scraper in node js ☆13 Updated 10 years ago
- Source code to accompany my paper "Poetic sound similarity vectors using phonetic features" ☆171 Updated 7 years ago
- Materials for Frontiers of Computational Journalism, Columbia Journalism School 2018 ☆11 Updated 6 years ago
- MoodCat😼 classifies the mood of English sentences. ☆14 Updated 2 years ago
- ☆22 Updated 3 years ago
- A repository of materials for a proposed class on automated story bots. ☆49 Updated 6 years ago
- Catalan ALBERT (A Lite BERT for self-supervised learning of language representations) ☆14 Updated 4 years ago