jcblaisecruz02 / Filipino-Text-Benchmarks
Open-source benchmark datasets and pretrained transformer models in the Filipino language.
☆61 Updated 10 months ago
Alternatives and similar repositories for Filipino-Text-Benchmarks
Users who are interested in Filipino-Text-Benchmarks are comparing it to the repositories listed below.
- Fake news detection in Filipino via Multitask Transfer Learning ☆15 Updated 10 months ago
- Dataset for Emotion Recognition Research ☆212 Updated 2 years ago
- A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures. ☆83 Updated 5 months ago
- Tagalog Words Stemmer using Python ☆28 Updated 2 years ago
- DeEpLearning models for MultIlingual haTespeech (DELIMIT): Benchmarking multilingual models across 9 languages and 16 datasets. ☆109 Updated 2 years ago
- A multilingual lexicon of words to hurt. ☆89 Updated this week
- This repository contains a dataset for hate speech detection on social media platforms. ☆73 Updated 2 years ago
- ☆109 Updated last year
- TimeLMs: Diachronic Language Models from Twitter ☆108 Updated last year
- XED multilingual emotion datasets ☆61 Updated 2 years ago
- AfriSenti-SemEval Shared Task 12: Sentiment Analysis for African languages: https://afrisenti-semeval.github.io/ ☆47 Updated last year
- Datasets for fake news and misinformation detection ☆67 Updated last year
- Testing and training detection models for emoji-based hate speech. ☆24 Updated 3 years ago
- Repository for XLM-T, a framework for evaluating multilingual language models on Twitter data ☆157 Updated 2 years ago
- Catalog of abusive language data (PLoS 2020) ☆314 Updated last year
- Röttger et al. (ACL 2021): "HateCheck: Functional Tests for Hate Speech Detection Models" - Data ☆59 Updated 3 years ago
- Datasets for Hate Speech Detection ☆130 Updated 2 years ago
- ☆150 Updated 2 years ago
- Class for Aspect-term extraction and Aspect-based sentiment analysis with BERT and Adapters ☆45 Updated 2 years ago
- Project on detecting misinformation and fake news ☆26 Updated 5 years ago
- Hate speech dataset from Stormfront forum manually labelled at sentence level. ☆173 Updated 5 years ago
- ☆56 Updated 2 years ago
- N-gram Extraction Approaches (bigrams, trigrams) ☆44 Updated 6 years ago
- Mega-COV: A Billion-Scale Dataset of 100+ Languages for COVID-19 ☆14 Updated 4 years ago
- A repository for publicly/freely available Natural Language Processing (NLP) datasets for African languages. ☆106 Updated last year
- Repository for the CommonLit Ease of Readability Corpus ☆24 Updated last year
- Detect toxic spans in toxic texts ☆69 Updated 2 years ago
- Sentence transformers models for SpaCy ☆107 Updated 2 years ago
- Open datasets for sentiment analysis based on tweets in English/Spanish/French/German/Italian ☆72 Updated last year
- Machine learning models from Singapore's NLP research community ☆36 Updated 2 years ago