☆86 · Nov 21, 2025 · Updated 6 months ago
Alternatives and similar repositories for mistral-evals
Users that are interested in mistral-evals are comparing it to the libraries listed below.
- Debug print operator for cudagraph debugging ☆15 · Aug 2, 2024 · Updated last year
- ☆21 · Dec 14, 2024 · Updated last year
- ☆12 · Apr 18, 2025 · Updated last year
- codes for Efficient Test-Time Scaling via Self-Calibration ☆20 · Sep 13, 2025 · Updated 8 months ago
- Tracking the history of the FARA data from https://www.justice.gov/nsd-fara ☆16 · Aug 3, 2023 · Updated 2 years ago
- Datamodels for hugging face tokenizers ☆107 · Apr 28, 2026 · Updated 3 weeks ago
- Python client SDK for Ultravox. ☆16 · Dec 10, 2025 · Updated 5 months ago
- This is an implementation of the paper "Are We Done with Object-Centric Learning?" ☆12 · Apr 12, 2026 · Updated last month
- From Accuracy to Robustness: A Study of Rule- and Model-based Verifiers in Mathematical Reasoning. ☆25 · Oct 7, 2025 · Updated 7 months ago
- ☆14 · Jan 22, 2025 · Updated last year
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling ☆42 · Dec 29, 2025 · Updated 4 months ago
- Quantized Attention on GPU ☆44 · Nov 22, 2024 · Updated last year
- [ICML 2024] SPP: Sparsity-Preserved Parameter-Efficient Fine-Tuning for Large Language Models ☆22 · May 28, 2024 · Updated last year
- [NeurIPS'23] Uncertainty Estimation for Safety-critical Scene Segmentation via Fine-grained Reward Maximization ☆19 · Aug 4, 2024 · Updated last year
- Compare how fine-tuned AI video models interpret the same prompts ☆14 · Jan 29, 2025 · Updated last year
- (ACL-IJCNLP 2021) Convolutions and Self-Attention: Re-interpreting Relative Positions in Pre-trained Language Models. ☆21 · Jul 13, 2022 · Updated 3 years ago
- Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?" ☆66 · Oct 19, 2024 · Updated last year
- A Datasette plugin for making data visualizations with Observable Plot ☆26 · Oct 21, 2025 · Updated 7 months ago
- This repository is designed for deploying and managing server processes that handle embeddings using the Infinity Embedding model or Larg… ☆26 · Mar 6, 2025 · Updated last year
- [ACL 2024] Official GitHub repo for OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scie… ☆192 · Jun 8, 2025 · Updated 11 months ago
- Efficient Feature Extraction for High-resolution Video Frame Interpolation (BMVC 2022) ☆14 · Aug 24, 2023 · Updated 2 years ago
- ☆66 · May 15, 2026 · Updated last week
- Run evals using LLM ☆27 · Jan 8, 2026 · Updated 4 months ago
- A curated list of resources related to structured generation 🔥 ☆23 · Jul 25, 2025 · Updated 10 months ago
- Test-time-training on nearest neighbors for large language models ☆50 · Apr 18, 2024 · Updated 2 years ago
- The Official Implementation of Ada-KV [NeurIPS 2025] ☆131 · Nov 26, 2025 · Updated 6 months ago
- AMD’s C++ library for accelerating tensor primitives ☆49 · May 19, 2026 · Updated last week
- Efficiently discovering algorithms via LLMs with evolutionary search and reinforcement learning. ☆17 · Apr 22, 2025 · Updated last year
- Cold Compress is a hackable, lightweight, and open-source toolkit for creating and benchmarking cache compression methods built on top of… ☆151 · Aug 9, 2024 · Updated last year
- X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains ☆49 · Feb 4, 2026 · Updated 3 months ago
- Data for the MTEB leaderboard ☆55 · Updated this week
- Neural Reflectance Field from Shading and Shadow under a Fixed Viewpoint ☆16 · Aug 8, 2022 · Updated 3 years ago
- Code, Data and Model for Paper "Learning from Peers in Reasoning Models" ☆27 · May 13, 2025 · Updated last year
- [NeurIPS 2025] More Thinking, Less Seeing? Assessing Amplified Hallucination in Multimodal Reasoning Models ☆78 · May 31, 2025 · Updated 11 months ago
- Multimodal language model benchmark, featuring challenging examples ☆187 · Dec 18, 2024 · Updated last year
- ☆11 · Sep 27, 2024 · Updated last year
- ☆25 · Apr 10, 2025 · Updated last year
- MathVista: data, code, and evaluation for Mathematical Reasoning in Visual Contexts ☆359 · Sep 29, 2025 · Updated 7 months ago
- ☆11 · Apr 7, 2023 · Updated 3 years ago