Every Eval Ever is a shared schema and crowdsourced eval database. It defines a standardized metadata format for storing AI evaluation results — from leaderboard scrapes and research papers to local evaluation runs — so that results from different frameworks can be compared, reproduced, and reused.
☆82 · Jun 15, 2026 · Updated 2 weeks ago
Alternatives and similar repositories for every_eval_ever
Users that are interested in every_eval_ever are comparing it to the libraries listed below.
- James' cookbook of evaluations and finetuning experiments ☆28 · Feb 19, 2026 · Updated 4 months ago
- ☆14 · Jul 13, 2025 · Updated 11 months ago
- ERRor ANnotation Toolkit: Automatically extract and classify grammatical errors in parallel original and corrected sentences. ☆12 · Mar 23, 2023 · Updated 3 years ago
- An official PyTorch implementation for CLIPPR ☆31 · Jul 22, 2023 · Updated 2 years ago
- $100K or 100 Days: Trade-offs when Pre-Training with Academic Resources ☆154 · Oct 2, 2025 · Updated 9 months ago
- Generate synthetic labeled data for extremely low-resource languages using bilingual lexicons. ☆20 · Oct 3, 2024 · Updated last year
- Build a Docker container to build, train and deploy fast.ai based Deep Learning models with Amazon SageMaker ☆13 · Dec 15, 2018 · Updated 7 years ago
- official code for paper Probing the Decision Boundaries of In-context Learning in Large Language Models. https://arxiv.org/abs/2406.11233… ☆20 · Jul 27, 2025 · Updated 11 months ago
- Repository for "Attribute First, then Generate: Locally-attributable Grounded Text Generation", ACL 2024 ☆30 · Dec 19, 2024 · Updated last year
- ☆11 · Oct 3, 2021 · Updated 4 years ago
- Deploy automl models for tabular tasks on AWS Sagemaker with AutoGluon ☆13 · Feb 28, 2020 · Updated 6 years ago
- A curated reading list of research in Sparse Autoencoders, Feature Extraction and related topics in Mechanistic Interpretability ☆32 · Jan 30, 2025 · Updated last year
- The official repo for "CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models" ☆35 · Mar 26, 2026 · Updated 3 months ago
- This was designed for interp researchers who want to do research on or with interp agents to give quality of life improvements and fix … ☆146 · Feb 8, 2026 · Updated 4 months ago
- implement minimal pytorch from scratch ☆22 · Apr 5, 2021 · Updated 5 years ago
- Open source pdf generation for focused teams ☆17 · Nov 24, 2025 · Updated 7 months ago
- The AI that helps you achieve your goals ☆11 · Feb 4, 2024 · Updated 2 years ago
- Blindspots in LLMs I've noticed while AI coding. Sonnet family emphasis. ☆13 · Mar 20, 2025 · Updated last year
- Fast wavelet transforms on the sphere ☆13 · Dec 20, 2016 · Updated 9 years ago
- 4-bit quantization of models using GPTQ ☆18 · Mar 6, 2023 · Updated 3 years ago
- Accompanying codebase for neuroscope.io, a website for displaying max activating dataset examples for language model neurons ☆14 · Feb 13, 2023 · Updated 3 years ago
- Automated terminal emulator benchmarks ☆24 · Jun 22, 2026 · Updated last week
- ☆10 · Nov 1, 2022 · Updated 3 years ago
- Sparsify transformers with cross-layer transcoders ☆26 · Nov 14, 2025 · Updated 7 months ago
- Fast Image Integrity Checker: Scan for corrupted images using Nvidia DALI ☆22 · Jun 20, 2021 · Updated 5 years ago
- Train and deploy ML models in the cloud ☆41 · Jun 5, 2026 · Updated 3 weeks ago
- ☆10 · Nov 8, 2022 · Updated 3 years ago
- Analyze and cure awesome lists by collecting, processing and presenting data from listed Git projects. ☆20 · Oct 12, 2025 · Updated 8 months ago
- Customizable charts made with TikZ and LaTeX3 ☆14 · Feb 11, 2023 · Updated 3 years ago
- Hypercorn is an ASGI and WSGI Server based on Hyper libraries and inspired by Gunicorn. ☆20 · Jan 12, 2026 · Updated 5 months ago
- An analog touch screen joystick that pretends to be a bevy gamepad ☆13 · Jul 13, 2024 · Updated last year
- Silicon Society Sandbox (SiliSocS) is a versatile and extensible experimentation system for EASE-configured generative multi-agent simul… ☆29 · Updated this week
- Official Inspect Implementation for "ImpossibleBench: Measuring LLMs' Propensity of Exploiting Test Cases" ☆45 · Dec 1, 2025 · Updated 7 months ago
- ☆13 · Jun 23, 2026 · Updated last week
- ☆14 · Jul 7, 2024 · Updated last year
- EvalAssist is an open-source project that simplifies using large language models as evaluators (LLM-as-a-Judge) of the output of other la… ☆102 · Apr 9, 2026 · Updated 2 months ago
- ASSIST: Towards Label Noise-Robust Dialogue State Tracking ☆10 · Apr 11, 2022 · Updated 4 years ago
- Flight Recorder lets you record client program execution and examine it later ☆11 · Sep 18, 2020 · Updated 5 years ago
- ☆30 · Jul 2, 2025 · Updated 11 months ago