jcblaisecruz02 / Filipino-Text-Benchmarks
Open-source benchmark datasets and pretrained transformer models in the Filipino language.
☆58 Updated 2 months ago
Related projects
Alternatives and complementary repositories for Filipino-Text-Benchmarks
- Fake news detection in Filipino via Multitask Transfer Learning ☆14 Updated 2 months ago
- AfriSenti-SemEval Shared Task 12: Sentiment Analysis for African languages: https://afrisenti-semeval.github.io/ ☆46 Updated 10 months ago
- Tagalog Words Stemmer using Python ☆27 Updated last year
- Dataset for Emotion Recognition Research ☆203 Updated last year
- ☆105 Updated 11 months ago
- A repository for publicly/freely available Natural Language Processing (NLP) datasets for African languages. ☆93 Updated 6 months ago
- TUFS Asian Language Parallel Corpus ☆49 Updated last year
- Mega-COV: A Billion-Scale Dataset of 100+ Languages for COVID-19 ☆14 Updated 3 years ago
- Arabic Dialect Identification on AOC data. ☆23 Updated 5 years ago
- A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures. ☆68 Updated 2 months ago
- This repository contains a dataset for hate speech detection on social media platforms. ☆66 Updated last year
- A module to compute textual lexical richness (aka lexical diversity). ☆93 Updated last year
- XED multilingual emotion datasets ☆56 Updated last year
- This is a simple Python package for calculating a variety of lexical diversity indices ☆65 Updated last year
- How to extract sentiment from opinions without any labels ☆137 Updated 2 years ago
- This code provides a word-level language identification tool for identifying the language of individual words in Code-Mixed text. e.g. The tex… ☆50 Updated 4 years ago
- Asent is a Python library for performing efficient and transparent sentiment analysis using spaCy. ☆115 Updated 7 months ago
- Pretrained BERT model for analysing COVID-19 Twitter data ☆184 Updated last year
- AfroLID, a powerful neural toolkit for African languages identification which covers 517 African languages. ☆28 Updated last year
- A multilingual lexicon of words to hurt. ☆80 Updated 2 weeks ago
- Mining individual characters in multiparty dialogue ☆165 Updated last year
- A collection of NLP resources for Malay ☆25 Updated 6 years ago
- DeEpLearning models for MultIlingual haTespeech (DELIMIT): Benchmarking multilingual models across 9 languages and 16 datasets. ☆107 Updated last year
- ☆16 Updated last year
- Datasets for Hate Speech Detection ☆115 Updated last year
- Morphological dictionary for Malay/Indonesian ☆16 Updated this week
- ☆29 Updated 10 months ago
- A collection of preprocessed datasets and pretrained models for generating paraphrases. ☆29 Updated 3 years ago
- This repository contains the Arabic sarcasm dataset (ArSarcasm) ☆23 Updated 3 years ago
- Multimodal Meme Classification: Identifying Offensive Content in Image and Text ☆66 Updated last year