CornellNLP / ConvoKit
ConvoKit is a toolkit for extracting conversational features and analyzing social phenomena in conversations. It includes several large conversational datasets along with scripts exemplifying the use of the toolkit on these datasets.
☆575 · Updated 2 months ago
Alternatives and similar repositories for ConvoKit:
Users interested in ConvoKit are comparing it to the libraries listed below.
- Linguistic Inquiry and Word Count (LIWC) analyzer ☆208 · Updated 3 years ago
- This repository contains EmoBank, a large-scale text corpus manually annotated with emotion according to the psychological Valence-Arousa… ☆203 · Updated 2 years ago
- Annotated dataset of 100 works of fiction to support tasks in natural language processing and the computational humanities. ☆352 · Updated 2 years ago
- Catalog of abusive language data (PLoS 2020) ☆309 · Updated 9 months ago
- analyze text with empath ☆324 · Updated 7 years ago
- Ekphrasis is a text processing tool, geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenizati… ☆666 · Updated last year
- Dialogue model that produces empathetic responses when trained on the EmpatheticDialogues dataset. ☆479 · Updated 3 years ago
- A Survey and Experiments on Annotated Corpora for Emotion Classification in Text ☆231 · Updated last year
- A python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coher… ☆1,225 · Updated last month
- Resources for the "SummEval: Re-evaluating Summarization Evaluation" paper ☆390 · Updated 9 months ago
- Pipeline to generate the Standardized Project Gutenberg Corpus ☆171 · Updated last year
- 📃Language Model based sentences scoring library ☆307 · Updated 3 years ago
- Mining individual characters in multiparty dialogue ☆170 · Updated last year
- To analyze and remove gender bias in coreference resolution systems ☆77 · Updated 3 years ago
- Models to perform neural summarization (extractive and abstractive) using machine learning transformers and a tool to convert abstractive… ☆429 · Updated last year
- A module to compute textual lexical richness (aka lexical diversity). ☆104 · Updated last year
- Datasets for Hate Speech Detection ☆124 · Updated last year
- Switchboard Dialog Act Corpus with Penn Treebank links ☆144 · Updated 4 years ago
- A reading list of up-to-date papers on NLP for Social Good. ☆301 · Updated last year
- Repository for TweetEval ☆367 · Updated 2 years ago
- A dataset containing human-human knowledge-grounded open-domain conversations. ☆646 · Updated 7 months ago
- BERTweet: A pre-trained language model for English Tweets (EMNLP-2020) ☆589 · Updated 8 months ago
- Papers on fairness in NLP ☆438 · Updated 10 months ago
- Hate speech dataset from Stormfront forum manually labelled at sentence level. ☆171 · Updated 4 years ago
- Collection of tools for building diachronic/historical word vectors ☆425 · Updated last year
- Scripts and links to recreate the ELI5 dataset. ☆324 · Updated 3 years ago
- Clustering sentence embeddings to extract message intent ☆173 · Updated 3 years ago
- A multilingual lexicon of words to hurt. ☆86 · Updated 4 months ago
- 💥 Use the latest Stanza (StanfordNLP) research models directly in spaCy ☆733 · Updated 7 months ago
- Python package of Tomoto, the Topic Modeling Tool ☆574 · Updated 7 months ago