dustalov / evalica
Evalica, your favourite evaluation toolkit
☆37Updated 2 weeks ago
Alternatives and similar repositories for evalica
Users who are interested in evalica compare it to the libraries listed below
- ☆31Updated 8 months ago
- Effective LLM Alignment Toolkit☆131Updated 3 weeks ago
- Modified Arena-Hard-Auto LLM evaluation toolkit with an emphasis on Russian language☆42Updated 2 months ago
- ☆22Updated last year
- MERA (Multimodal Evaluation for Russian-language Architectures) is a new open benchmark for the Russian language for evaluating fundament…☆62Updated 7 months ago
- First-of-its-kind AI benchmark for evaluating the protection capabilities of large language model (LLM) guard systems (guardrails and saf…☆37Updated 3 weeks ago
- RuCLIP tiny (Russian Contrastive Language–Image Pretraining) is a neural network trained to work with different pairs (images, texts).☆33Updated 2 years ago
- Framework for processing and filtering datasets☆27Updated 10 months ago
- SAGE: Spelling correction, corruption and evaluation for multiple languages☆153Updated 5 months ago
- Top ML papers of the week.☆31Updated this week
- Augmentex — a library for augmenting texts with errors☆64Updated 11 months ago
- A benchmark comparing Russian ChatGPT analogues: Saiga, YandexGPT, Gigachat☆61Updated last year
- Convert MUSE from TensorFlow to PyTorch and ONNX☆11Updated last year
- Gazeta: Dataset for automatic summarization of Russian news☆35Updated 3 years ago
- ☆18Updated 2 months ago
- Bunch of notebooks for pre-training custom Saiga-like LLM☆13Updated last year
- RuTransform: python framework for adversarial attacks and text data augmentation for Russian☆19Updated last year
- MERA (Multimodal Evaluation for Russian-language Architectures) is a new open benchmark for the Russian language for evaluating SOTA mode…☆25Updated 2 months ago
- This project covers my participation in the RuNNE competition https://github.com/dialogue-evaluation/RuNNE☆12Updated last year
- This repository measures the quality of Yandexgpt, Gigachat, T-Pro, Saiga, Vikhr, Ruadapt on popular English-language benchmarks: MGSM, MATH, HumanE…☆23Updated last month
- Pipeline for easy fine-tuning of BERT architecture for sequence classification☆23Updated last year
- ☆17Updated 3 years ago
- RuBLiMP: Russian Benchmark of Linguistic Minimal Pairs☆17Updated 3 months ago
- Reinforcement Learning Library.☆28Updated 2 years ago
- Repository for the paper: "Revisiting BPR: A Replicability Study of a Common Recommender System Baseline"☆52Updated 6 months ago
- ☆57Updated last year
- MMLU eval for RU/EN☆15Updated last year
- ☆31Updated last month
- ☆46Updated last month
- ☆20Updated 10 months ago