dustalov / evalica
Evalica, your favourite evaluation toolkit
☆32Updated 3 weeks ago
Alternatives and similar repositories for evalica:
Users who are interested in evalica are comparing it to the libraries listed below
- ☆31Updated 6 months ago
- Modified Arena-Hard-Auto LLM evaluation toolkit with an emphasis on Russian language☆39Updated last week
- Effective LLM Alignment Toolkit☆123Updated 2 weeks ago
- MERA (Multimodal Evaluation for Russian-language Architectures) is a new open benchmark for the Russian language for evaluating fundament…☆62Updated 5 months ago
- A benchmark comparing Russian ChatGPT analogues: Saiga, YandexGPT, Gigachat☆61Updated last year
- RuBLiMP: Russian Benchmark of Linguistic Minimal Pairs☆17Updated last month
- RuCLIP tiny (Russian Contrastive Language–Image Pretraining) is a neural network trained to work with different pairs (images, texts).☆32Updated 2 years ago
- Augmentex — a library for augmenting texts with errors☆63Updated 8 months ago
- ☆22Updated last year
- SAGE: Spelling correction, corruption and evaluation for multiple languages☆150Updated 3 months ago
- MERA (Multimodal Evaluation for Russian-language Architectures) is a new open benchmark for the Russian language for evaluating SOTA mode…☆20Updated 4 months ago
- Framework for processing and filtering datasets☆27Updated 7 months ago
- Convert MUSE from TensorFlow to PyTorch and ONNX☆11Updated 10 months ago
- Top ML papers of the week.☆25Updated this week
- ☆11Updated last year
- ☆18Updated 3 months ago
- ☆26Updated this week
- This repository measures the quality of Yandexgpt, Gigachat, T-Pro, Saiga, Vikhr, and Ruadapt on popular English-language benchmarks: MGSM, MATH, HumanE…☆21Updated last week
- AI-generated text boundary detection with RoFT☆24Updated 6 months ago
- Bunch of notebooks for pre-training custom Saiga-like LLM☆13Updated last year
- ☆39Updated 2 weeks ago
- ☆43Updated 2 weeks ago
- RuTransform: python framework for adversarial attacks and text data augmentation for Russian☆19Updated last year
- Enterprise RAG Challenge to test accuracy of different LLM-driven assistants☆48Updated last week
- ☆20Updated 8 months ago
- ☆26Updated 3 weeks ago
- A set of scripts and configurations for pretraining of Large Language Models (LLM)☆28Updated 3 weeks ago
- Repository for the paper: "Revisiting BPR: A Replicability Study of a Common Recommender System Baseline"☆50Updated 4 months ago
- Official implementation of the paper "Linear Transformers with Learnable Kernel Functions are Better In-Context Models"☆159Updated 2 months ago
- MMLU eval for RU/EN☆15Updated last year