delvinso / covid19_unique_tweets
An on-going dataset consisting of hashtags, n-gram counts and other misc NLP things for covid-19 analysis, stemming from over 100 000 000 tweets collected since mid-January 2020.
☆58 Updated 3 years ago
Alternatives and similar repositories for covid19_unique_tweets
Users that are interested in covid19_unique_tweets are comparing it to the libraries listed below
- Quote extraction for modular journalism (JournalismAI collab 2021) ☆230 Updated 3 years ago
- Cleans Reddit Text Data ☆83 Updated 5 years ago
- Topic Inference with Zeroshot models ☆61 Updated 2 years ago
- Browse Covid-19 & SARS-CoV-2 Scientific Papers with Transformers 🦠 📖 ☆184 Updated 3 years ago
- Python script to download public Tweets from a given Twitter account into a format suitable for AI text generation. ☆226 Updated 5 years ago
- Hate Speech Detection Library for Python. ☆195 Updated last month
- 📝🔍 A browser extension that displays the GPT-2 Log Probability of selected text ☆112 Updated 2 years ago
- Getting recommendations from natural language ☆123 Updated 5 years ago
- Social Analysis based on Whatsapp data ☆148 Updated 2 years ago
- Interpretable data visualizations for understanding how texts differ at the word level ☆285 Updated 10 months ago
- 📊 Semantic search for headlines and story text ☆360 Updated 2 years ago
- DRIFT is a tool for Diachronic Analysis of Scientific Literature. ☆126 Updated 2 months ago
- Semantic search through a vectorized Wikipedia (SentenceBERT) with the Weaviate vector search engine ☆244 Updated 2 years ago
- Hate speech dataset from Stormfront forum manually labelled at sentence level. ☆175 Updated 5 years ago
- Clean personally identifiable information from dirty dirty text using spaCy. ☆41 Updated 2 years ago
- Social Media Mining Toolkit (SMMT) main repository ☆136 Updated 3 years ago
- A repository to house model building experiments and tools that are part of the Conversation AI effort. ☆143 Updated 2 weeks ago
- Unreliable News Index (for Columbia Journalism Review) ☆56 Updated 3 years ago
- Topic modeling helpers using managed language models from Cohere. Name text clusters using large GPT models. ☆222 Updated 3 years ago
- List of datasets to apply stats/machine learning/technology to the world of social good. ☆247 Updated 5 years ago
- Text analysis with networks. ☆291 Updated last month
- open datasets for sentiment analysis based on tweets in English/Spanish/French/German/Italian ☆75 Updated 2 years ago
- A simple NLP library allows profiling datasets with one or more text columns. When given a dataset and a column name containing text data… ☆243 Updated last year
- Healthsea is a spaCy pipeline for analyzing user reviews of supplementary products for their effects on health. ☆92 Updated 4 years ago
- A Flask webapp & Python scripts for predicting reddit users' political leaning, using their comment history. ☆63 Updated 2 years ago
- A multithread Pushshift.io API Wrapper for reddit.com comment and submission searches. ☆220 Updated 2 years ago
- The world's largest social media toxicity dataset. ☆187 Updated 3 years ago
- Here are the notebooks used during the spacy youtube series. ☆103 Updated 4 years ago
- Deep learning with text doesn't have to be scary. ☆275 Updated 2 years ago
- Sentence transformers models for SpaCy ☆109 Updated 2 years ago