jcblaisecruz02 / Filipino-Text-Benchmarks
Open-source benchmark datasets and pretrained transformer models in the Filipino language.
☆61 Updated 9 months ago
Alternatives and similar repositories for Filipino-Text-Benchmarks
Users who are interested in Filipino-Text-Benchmarks are comparing it to the repositories listed below.
- Fake news detection in Filipino via Multitask Transfer Learning ☆15 Updated 9 months ago
- AfriSenti-SemEval Shared Task 12: Sentiment Analysis for African languages: https://afrisenti-semeval.github.io/ ☆48 Updated last year
- Machine learning models from Singapore's NLP research community ☆36 Updated 2 years ago
- Repository for the CommonLit Ease of Readability Corpus ☆24 Updated last year
- This repository contains a dataset for hate speech detection on social media platforms. ☆72 Updated 2 years ago
- ☆110 Updated last year
- A module to compute textual lexical richness (aka lexical diversity). ☆108 Updated last year
- A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures. ☆82 Updated 4 months ago
- Live survey of off-the-shelf language identification tools for Python ☆26 Updated 3 years ago
- Dataset for Emotion Recognition Research ☆211 Updated 2 years ago
- MobileBERT and DistilBERT for extractive summarization ☆89 Updated last year
- A dataset for Indonesian Named Entity Recognizer ☆30 Updated 4 years ago
- Abstractive and Extractive Text summarization using Transformers. ☆83 Updated last year
- Multilingual abstractive summarization dataset extracted from WikiHow. ☆91 Updated 2 months ago
- Class for Aspect-term extraction and Aspect-based sentiment analysis with BERT and Adapters ☆45 Updated 2 years ago
- Datasets for Hate Speech Detection ☆128 Updated 2 years ago
- A multilingual lexicon of words to hurt. ☆89 Updated 7 months ago
- cLang-8 is a dataset for grammatical error correction. ☆106 Updated 2 years ago
- Benchmarking Multidomain English-Indonesian Machine Translation ☆16 Updated 4 years ago
- How to extract sentiment from opinions without any labels ☆139 Updated 3 years ago
- DeEpLearning models for MultIlingual haTespeech (DELIMIT): Benchmarking multilingual models across 9 languages and 16 datasets. ☆109 Updated last year
- Applying BERT to named entity recognition in English and Russian. ☆162 Updated 2 years ago
- Mega-COV: A Billion-Scale Dataset of 100+ Languages for COVID-19 ☆14 Updated 4 years ago
- Repository for XLM-T, a framework for evaluating multilingual language models on Twitter data ☆156 Updated 2 years ago
- NusaWrites is an in-depth analysis of corpora collection strategy and a comprehensive language modeling benchmark for underrepresented an… ☆25 Updated 8 months ago
- Jojajovai Guarani-Spanish Parallel Corpus ☆15 Updated 2 years ago
- This repository contains the Arabic sarcasm dataset (ArSarcasm) ☆24 Updated 4 years ago
- Hindi NLP work ☆14 Updated 3 years ago
- Tagalog Words Stemmer using Python ☆28 Updated 2 years ago
- This is a simple Python package for calculating a variety of lexical diversity indices ☆77 Updated last year