LoLei/redditcleaner

Cleans Reddit Text Data

☆83

Alternatives and similar repositories for redditcleaner

Users that are interested in redditcleaner are comparing it to the libraries listed below.

jmhessel / catrank
Pretrained models for the ranking task described in Cats and Captions vs. Creators and the Clock (WWW 2017)
☆11 · Apr 28, 2019 · Updated 7 years ago
kharrigian / smgeo
Geolocation Inference for Reddit
☆14 · Jun 17, 2024 · Updated 2 years ago
gstonge / SamplableSet
An efficient implementation of a set which can be randomly sampled according to the weights of the elements.
☆24 · May 13, 2024 · Updated 2 years ago
gkiril / MinSCIE
MinScIE is an Open Information Extraction system which provides structured knowledge enriched with semantic information about citations.
☆15 · Jun 9, 2019 · Updated 7 years ago
pushshift / zreader
Read compressed NDJSON .zst files easily
☆36 · Jun 23, 2026 · Updated last month
SussexCompSem / learninghypernyms
Learning to Distinguish Hypernyms and Co-Hyponyms
☆18 · Nov 11, 2014 · Updated 11 years ago
nchambers / caevo
A temporal ordering system for events and time expressions in written text.
☆43 · Feb 26, 2022 · Updated 4 years ago
JHUAPL / PINE
Collaborative NLP annotation tool supporting enterprise authentication, inter-annotator statistics, active learning
☆13 · Mar 5, 2023 · Updated 3 years ago
fsolt / pewdata
Reproducible Retrieval of Pew Research Center Datasets in R
☆10 · Apr 14, 2021 · Updated 5 years ago
pushshift / Parallel-NDJSON-Reader
Parallel NDJSON Reader for Python
☆17 · Dec 4, 2019 · Updated 6 years ago
simonmunzert / hitler-speeches
Supplementary and replication materials for paper "Examining a Most Likely Case for Strong Campaign Effects: Hitler's Speeches and the Ri…
☆15 · Jun 6, 2018 · Updated 8 years ago
osome-iu / osometweet
OSoMe Twitter tools. Including a package like tweepy but for the v2 Twitter api.
☆31 · Jan 6, 2023 · Updated 3 years ago
igorbrigadir / twitter-history
Tracking significant changes to the Twitter API or platform as a whole
☆20 · May 16, 2022 · Updated 4 years ago
jkbren / networkx-edge-bundling
hacky way to bundle some edges in networkx and matplotlib
☆20 · May 26, 2020 · Updated 6 years ago
stanis-morozov / prodige
A supplementary code for Beyond Vector Spaces: Compact Data Representation as Differentiable Weighted Graphs.
☆47 · Nov 2, 2019 · Updated 6 years ago
lmcinnes / subreddit_mapping
Notebooks and data associated to constructing and exploring a map of subreddits.
☆56 · Apr 24, 2017 · Updated 9 years ago
daoudclarke / rte-experiment
Experiments for recognising textual entailment
☆14 · Oct 12, 2012 · Updated 13 years ago
quanteda / quanteda.classifiers
quanteda textmodel extensions for classifying documents
☆21 · Oct 17, 2023 · Updated 2 years ago
skupriienko / Ukrainian-Sentiment-Analysis
The list of Ukrainian words for sentiment analysis and NLP
☆15 · Sep 5, 2021 · Updated 4 years ago
kashish-s / TruthSocial_2024ElectionInitiative
This repository contains data of TruthSocial posts related to the 2024 U.S. Elections
☆12 · Nov 1, 2024 · Updated last year
LoicGrobol / ginger
Format conversion and graphical representation of [Universal Dependencies](http://universaldependencies.org) trees.
☆12 · Sep 3, 2024 · Updated last year
yudai / golcs
Go Longest Common Subsequence
☆24 · Oct 7, 2021 · Updated 4 years ago
adamlauretig / bwe
Implements the model described in "Identification, Interpretability, and Bayesian Word Embeddings"
☆19 · Jun 5, 2019 · Updated 7 years ago
elehman16 / discq
☆19 · Oct 13, 2022 · Updated 3 years ago
adinagit / tiktok-research-api-python
Simple Python wrapper for querying data with TikTok's research API
☆13 · Dec 25, 2023 · Updated 2 years ago
SMAPPNYU / urlExpander
🌬️urlExpander is a Python package for expanding shortened links (urls).
☆76 · Oct 5, 2022 · Updated 3 years ago
mkearney / rtweet.download
{rtweet} helpers for automating large or time-consuming downloads
☆23 · Dec 11, 2019 · Updated 6 years ago
pdufter / densray
Getting interpretable dimensions in word embedding spaces.
☆15 · Jul 6, 2023 · Updated 3 years ago
kstats / CausalQG
☆15 · Apr 19, 2021 · Updated 5 years ago
yy / project-template
A template for research repositories
☆27 · Jul 17, 2026 · Updated last week
ryanjgallagher / shifterator
Interpretable data visualizations for understanding how texts differ at the word level
☆290 · Jun 30, 2026 · Updated 3 weeks ago
PxYu / entity-expansion
Corpus-based Set Expansion with Lexical Features and Distributed Representations (SIGIR '19)
☆13 · Jul 18, 2019 · Updated 7 years ago
brianckeegan / Wikipedia
Crawling and analyzing data on Wikipedia
☆17 · Mar 8, 2024 · Updated 2 years ago
qiangning / TemProb-NAACL18
☆11 · Feb 8, 2022 · Updated 4 years ago
pushshift / rinzler
A high performance indexing and search system for managing big data
☆18 · Mar 18, 2019 · Updated 7 years ago
abachaa / RQE_Data_AMIA2016
The medical question entailment data introduced in the AMIA 2016 Paper (Recognizing Question Entailment for Medical Question Answering)
☆14 · May 13, 2026 · Updated 2 months ago
kenlimmj / fightin-words
A scikit-learn compliant implementation of Monroe et al.'s Fightin' Words analysis method.
☆11 · May 26, 2026 · Updated 2 months ago
gchochla / Demux-MEmo
[ICASSP'23] This repo contains code for the Demux & MEmo emotion recognition models (https://arxiv.org/abs/2210.15842), as well as code t…
☆23 · Jan 18, 2024 · Updated 2 years ago
diegma / span-gcn
Code of the paper Graph Convolutions over Constituent Trees for Syntax-Aware Semantic Role Labeling
☆15 · Nov 15, 2020 · Updated 5 years ago