crlwingen / TagalogStemmerPython
Tagalog Words Stemmer using Python
☆27 Updated last year
Related projects
Alternatives and complementary repositories for TagalogStemmerPython
- ☆12 Updated 4 years ago
- Repository for Philippine language dictionary data ☆19 Updated last year
- Open-source benchmark datasets and pretrained transformer models in the Filipino language. ☆58 Updated 2 months ago
- Datasets for fake news and misinformation detection ☆63 Updated last year
- A comparison and discussion of different NLP methods for 5-class sentiment classification on the SST-5 dataset. ☆167 Updated 3 months ago
- DeEpLearning models for MultIlingual haTespeech (DELIMIT): Benchmarking multilingual models across 9 languages and 16 datasets. ☆107 Updated last year
- Code for the paper "Characterizing and Detecting Hateful Users on Twitter" ☆73 Updated 3 years ago
- Code for the paper "Content Analysis of Textbooks via Natural Language Processing". ☆56 Updated last year
- Train unsupervised LDA Topic Model on raw Yelp review text, use topic distributions as feature inputs to supervised classifier of review … ☆76 Updated 5 years ago
- analyze text with empath ☆315 Updated 7 years ago
- Enhanced Subject Word Object Extraction ☆148 Updated 3 years ago
- Hate speech dataset from Stormfront forum manually labelled at sentence level. ☆164 Updated 4 years ago
- ☆226 Updated 7 years ago
- A deep learning system for demographic inference (gender, age, and individual/person) that was trained on massive Twitter dataset using p… ☆146 Updated last year
- A Python wrapper around the topic modeling functions of MALLET. ☆99 Updated 3 weeks ago
- This is a simple Python package for calculating a variety of lexical diversity indices ☆65 Updated last year
- Project on detecting misinformation and fake news ☆23 Updated 5 years ago
- Catalog of abusive language data (PLoS 2020) ☆304 Updated 5 months ago
- Mega-COV: A Billion-Scale Dataset of 100+ Languages for COVID-19 ☆14 Updated 3 years ago
- Pretrained BERT model for analysing COVID-19 Twitter data ☆184 Updated last year
- Cleans Reddit Text Data ☆81 Updated 4 years ago
- A multilingual lexicon of words to hurt. ☆80 Updated 2 weeks ago
- N-gram Extraction Approaches (bigrams, trigrams) ☆42 Updated 6 years ago
- Scrape news articles and analyze them using NLP to quantify the gender gap in Canadian mainstream media ☆39 Updated 6 months ago
- Natural language processing resources for multiple languages, with an eye towards use for digital humanities. ☆124 Updated 3 years ago
- MobileBERT and DistilBERT for extractive summarization ☆87 Updated last year
- Twitter word embeddings generated using Word2Vec and FastText. ☆49 Updated 5 years ago
- ☆160 Updated last year
- 🔤 Calculate average word embeddings (word2vec) from documents for transfer learning ☆54 Updated 6 months ago
- Steam review texting embedding analysis ☆141 Updated last year