Brand24-AI / mms_benchmark
The most extensive open massively multilingual corpus of datasets for training sentiment models. The corpus consists of 79 datasets, manually selected from over 350 reported in the scientific literature based on strict quality criteria, and covers 27 languages.
☆16Updated last year
Alternatives and similar repositories for mms_benchmark
Users who are interested in mms_benchmark are comparing it to the libraries listed below.
- The robust European language model benchmark.☆110Updated last week
- A Simple Bulk Labelling Tool☆589Updated 6 months ago
- ☆300Updated last year
- just a bunch of useful embeddings for scikit-learn pipelines☆502Updated 3 months ago
- A Hackable speech recognition library.☆25Updated 9 months ago
- ☆359Updated last year
- Optimus is a flexible and scalable framework built to train language models efficiently across diverse hardware configurations, including…☆66Updated 2 weeks ago
- A library for detecting problematic data segments in structured and unstructured data with few lines of code.☆64Updated last year
- A list of awesome open source projects in the machine learning field, whose developers are mainly based in Germany☆43Updated 10 months ago
- 💬 Language Identification with Support for More Than 2000 Labels -- EMNLP 2023☆144Updated last month
- Tool for named entity recognition for Polish based on deep learning.☆31Updated 2 years ago
- HF's ML for Audio study group☆194Updated 2 years ago
- ☆22Updated last year
- ☆370Updated 10 months ago
- A tokenizer, text cleaner, and phonemizer for many human languages.☆320Updated 8 months ago
- Advanced data structures for handling temporal segments with attached labels.☆114Updated 5 months ago
- Fine-tuning scripts for evaluating transformer-based models on KLEJ benchmark.☆26Updated 2 years ago
- A fast and lightweight python-based CTC beam search decoder for speech recognition.☆449Updated 2 years ago
- A Scandinavian Benchmark for sentence embeddings☆39Updated last month
- ☆106Updated 3 weeks ago
- 🫠 check your data, before you wreck your model☆16Updated 2 years ago
- AfriBERTa: Exploring the Viability of Pretrained Multilingual Language Models for Low-resourced Languages☆74Updated 3 years ago
- Efficient BM25 with DuckDB 🦆☆52Updated 6 months ago
- animal2vec: A self-supervised transformer for rare-event raw audio input☆25Updated 5 months ago
- ☆41Updated 2 months ago
- A small rust-based data loader☆30Updated last month
- A python package for benchmarking interpretability techniques on Transformers.☆213Updated 9 months ago
- Confection: the sweetest config system for Python☆187Updated 3 months ago
- ☆56Updated 2 years ago
- Speakerbox: Fine-tune Audio Transformers for speaker identification.☆58Updated 7 months ago