This code implements a basic, Twitter-aware tokenizer.
☆12 · Feb 8, 2024 · Updated 2 years ago
Alternatives and similar repositories for happierfuntokenizing
Users that are interested in happierfuntokenizing are comparing it to the libraries listed below.
- Incivility classifier used in Theocharis et al (2020, Sage Open) ☆20 · Aug 29, 2022 · Updated 3 years ago
- A python package for classifying emotion ☆18 · Oct 20, 2020 · Updated 5 years ago
- This repository contains data of TikTok videos related to the 2024 U.S. Elections ☆35 · Feb 17, 2025 · Updated last year
- Scrapes headlines from CNN and FOX, then has ChatGPT do cross-analysis ☆11 · Apr 19, 2023 · Updated 2 years ago
- Classification of incivility in Reddit posts ☆18 · Nov 19, 2020 · Updated 5 years ago
- Cornell INFO 3350: Text mining for history and literature, Fall 2020 ☆10 · Jan 14, 2021 · Updated 5 years ago
- ☆13 · Jul 26, 2023 · Updated 2 years ago
- Quickly make tables of descriptive statistics (i.e., counts, percentages, confidence intervals) for categorical variables. This package i… ☆13 · Mar 5, 2026 · Updated 2 weeks ago
- A dsniff project using bro ☆11 · Jan 25, 2016 · Updated 10 years ago
- GisPy: A Tool for Measuring Gist Inference Score in Text https://aclanthology.org/2022.wnu-1.5/ ☆13 · Jul 1, 2024 · Updated last year
- ☆13 · Jan 8, 2021 · Updated 5 years ago
- Beiwe is a smartphone-based digital phenotyping research platform. This repository contains some data analysis code. ☆14 · Feb 10, 2020 · Updated 6 years ago
- Code and Hummingbird dataset for EMNLP 2021 paper "Does BERT Learn as Humans Perceive? Understanding Linguistic Styles through Lexica" ☆14 · Apr 13, 2022 · Updated 3 years ago
- Models, scripts, and data sets for data annotation (aka coding, aka rating) ☆12 · Mar 9, 2015 · Updated 11 years ago
- Senior A.I. project to generate realistic news articles like those found on CNN, NYTimes, Fox News, etc. Future research will involve con… ☆15 · Apr 26, 2019 · Updated 6 years ago
- Ongoing list of useful lexica from Computational Social Science ☆14 · Dec 14, 2020 · Updated 5 years ago
- Code for the experiments in the ACL 2020 paper "Estimating predictive uncertainty for rumour verification models" ☆11 · May 15, 2020 · Updated 5 years ago
- Classify the kind of content hosted by the domain using the domain name, and text and screenshot of the homepage. ☆16 · Dec 20, 2025 · Updated 3 months ago
- End to end human text analysis package, specifically suited for social media and social scientific applications. It is written in Python … ☆129 · Updated this week
- SentencePersonality computes personality traits, as described in Big5 model, from myPersonality dataset. ☆16 · Jun 21, 2020 · Updated 5 years ago
- Backtesting fbprophet prediction of Silver prices for 2017 ☆14 · Nov 29, 2017 · Updated 8 years ago
- A simple Wikipedia talk page parser ☆11 · May 10, 2018 · Updated 7 years ago
- ☆46 · Oct 28, 2024 · Updated last year
- Repository for code and dataset for our EMNLP 2021 paper - “So You Think You’re Funny?”: Rating the Humour Quotient in Standup Comedy. ☆15 · Sep 26, 2022 · Updated 3 years ago
- A Client for the MTurk Requester API ☆16 · Apr 6, 2024 · Updated last year
- Using Centroids of Word Embeddings and Word Mover's Distance for Biomedical Document Retrieval in Question Answering. ☆14 · Jul 13, 2017 · Updated 8 years ago
- This is an introduction to Chinese words segmentation using Jieba. ☆14 · May 31, 2018 · Updated 7 years ago
- Corpus and annotations for the CL-Aff Shared Task - Get it #OffMyChest - from the University of Pennsylvania and Nanyang Technological Un… ☆12 · Sep 12, 2021 · Updated 4 years ago
- Modelling Big Five Personality Inventory using Machine Learning algorithms ☆22 · Nov 12, 2024 · Updated last year
- Library for processing MOOC data dumps. Currently limited to Coursera data. ☆11 · Feb 11, 2021 · Updated 5 years ago
- CopyNet (Copy Mechanism in Seq2Seq) implementation with TensorFlow 2 ☆10 · Nov 21, 2022 · Updated 3 years ago
- Data and code from our stories, "Google Has a Secret Blocklist that Hides YouTube Hate Videos from Advertisers—But It’s Full of Holes," a… ☆30 · Aug 23, 2021 · Updated 4 years ago
- Project Discourse Quality for political deliberations online. ☆10 · Feb 1, 2021 · Updated 5 years ago
- Provides a minimal implementation to extract FLAN datasets for further processing ☆11 · Feb 1, 2023 · Updated 3 years ago
- LIWC (Linguistic Inquiry and Word Count) in Python ☆10 · Dec 8, 2022 · Updated 3 years ago
- Data and code for "Understanding Linearity of Cross-Lingual Word Embedding Mappings" (TMLR 2022) ☆12 · Jun 8, 2022 · Updated 3 years ago
- Simple Python wrapper for querying data with TikTok's research API ☆13 · Dec 25, 2023 · Updated 2 years ago
- Master Agentic Coding ☆90 · Mar 16, 2026 · Updated last week
- TikTok-Teller: A TikTok Video Scraping and Content Analysis Tool ☆20 · Nov 20, 2023 · Updated 2 years ago