mit-ccc / RadioTalk
The RadioTalk dataset of talk radio transcripts
☆57 Updated 3 years ago
Related projects
Alternatives and complementary repositories for RadioTalk
- New York Times Word Innovation Types dataset ☆21 Updated 3 years ago
- Python tools for text ☆15 Updated 4 years ago
- Code for my blog post on Generating Words from Embeddings ☆23 Updated 3 months ago
- Collaborative web framework for analyzing text (e.g., tweets). Supports standard labeling and pairwise comparison. ☆14 Updated 3 years ago
- Repository for code and metadata to support work described in "Authorless Topic Models: Biasing Models Away from Known Structure" ☆28 Updated 4 years ago
- TopicScan: Visualization and validation interface for NMF Topic Modeling ☆23 Updated 4 years ago
- A simple interface to the Project Gutenberg corpus. ☆17 Updated 8 years ago
- finite-state toolkit, EM and Bayesian (Gibbs sampling) training for FST and context-free derivation forests ☆41 Updated 2 years ago
- Featurize words into orthographic and phonological vectors. ☆40 Updated last year
- Analysis of the Gutenberg dataset ☆40 Updated 5 years ago
- ☆32 Updated 2 years ago
- How (but not why) to do Twitter sociolinguistic analysis in the Unix Shell ☆10 Updated 8 years ago
- Matrix tools for building and inspecting latent spaces ☆27 Updated 6 years ago
- bin files ☆13 Updated 2 months ago
- R tools to download, ingest, and analyze the Phoenix dataset from the Open Event Data Alliance ☆12 Updated 8 years ago
- Code for learning geographically-informed word embeddings ☆22 Updated 2 years ago
- Jupyter extension to visualize dependency structures ☆28 Updated 6 years ago
- An API to access data from The New Yorker Caption Contest ☆60 Updated last year
- Dataset used to analyze user preferences of podcast summaries ☆8 Updated 2 years ago
- ☆14 Updated 2 years ago
- Demonstration of the results in "Text Normalization using Memory Augmented Neural Networks", Authors: Subhojeet Pramanik, Aman Hussain ☆60 Updated 5 years ago
- A framework to identify relations between ideas in temporal text corpora. ☆29 Updated 6 years ago
- Experiments to help discussion on Wikipedia talk pages ☆66 Updated this week
- MiTextExplorer - interactive browser of text and document covariates. ☆24 Updated 9 years ago
- Code and data for Koenecke et al. (2020) ☆28 Updated last year
- Toolkit to compile a comparable/parallel corpus from European Parliament proceedings ☆15 Updated 4 years ago
- A simple neural truecaser written in pytorch and allennlp. ☆32 Updated 5 months ago
- Practical Approaches to Data Science with Text ☆38 Updated 4 years ago
- jiant-dev ☆28 Updated 3 years ago
- dataset of podcasts and episodes ☆14 Updated 6 years ago