A small library of LLM judges
☆328 · Jul 31, 2025 · Updated 7 months ago
Alternatives and similar repositories for judges
Users who are interested in judges are comparing it to the libraries listed below.
- splits videos into scenes with gpt-4o-mini and saves them separately ☆12 · Dec 19, 2024 · Updated last year
- TaskWeaver Plugins ☆12 · Jan 28, 2024 · Updated 2 years ago
- moodist ☆24 · Feb 20, 2026 · Updated 2 weeks ago
- A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models. ☆1,602 · Dec 20, 2025 · Updated 2 months ago
- ☆19 · Mar 16, 2025 · Updated 11 months ago
- [ICML '24] R2E: Turn any GitHub Repository into a Programming Agent Environment ☆141 · Apr 20, 2025 · Updated 10 months ago
- Async RL Training at Scale ☆1,107 · Updated this week
- Inference-time scaling for LLMs-as-a-judge. ☆330 · Nov 5, 2025 · Updated 4 months ago
- ☆21 · Jun 4, 2024 · Updated last year
- This library supports evaluating disparities in generated image quality, diversity, and consistency between geographic regions. ☆20 · Jun 3, 2024 · Updated last year
- FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes ☆213 · Sep 18, 2025 · Updated 5 months ago
- Easy to use and open-source unknown stealer ☆22 · Jul 24, 2023 · Updated 2 years ago
- ☆20 · Jan 7, 2024 · Updated 2 years ago
- Generate Synthetic Data Using OpenAI, MistralAI or AnthropicAI ☆222 · Apr 29, 2024 · Updated last year
- a Python library that uses Reinforcement Learning (RL) to train LLMs. ☆42 · Mar 1, 2026 · Updated last week
- Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-… ☆3,868 · May 17, 2025 · Updated 9 months ago
- Foyle is a copilot to help developers deploy and operate their applications. ☆133 · Mar 17, 2025 · Updated 11 months ago
- ☆40 · Jul 26, 2024 · Updated last year
- Using Machine Learning to Create Funny Memes ☆25 · Mar 2, 2023 · Updated 3 years ago
- Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse … ☆884 · Updated this week
- Analyzing the most strategic words to guess on Wordle, based on letter frequency distributions ☆11 · Feb 20, 2022 · Updated 4 years ago
- ☆10 · Dec 3, 2020 · Updated 5 years ago
- ☆13 · Nov 5, 2024 · Updated last year
- ☆11 · Aug 25, 2021 · Updated 4 years ago
- The official implementation of Self-Play Fine-Tuning (SPIN) ☆1,235 · May 8, 2024 · Updated last year
- Fast Multimodal Semantic Deduplication & Filtering ☆892 · Jan 20, 2026 · Updated last month
- Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks. ☆2,915 · Updated this week
- Fast, Accurate, Lightweight Python library to make State of the Art Embedding ☆2,759 · Updated this week
- Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard a… ☆2,065 · Dec 3, 2025 · Updated 3 months ago
- ☆41 · Oct 3, 2024 · Updated last year
- Streamline on-policy/off-policy distillation workflows in a few lines of code ☆96 · Feb 26, 2026 · Updated last week
- Curated list of datasets and tools for post-training. ☆4,274 · Nov 10, 2025 · Updated 3 months ago
- Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets ☆4,884 · Mar 2, 2026 · Updated last week
- Code for reproducing our paper: LMSOC: An Approach for Socially Sensitive Pretraining ☆13 · Oct 22, 2021 · Updated 4 years ago
- Umap is a Python library that transforms OpenStreetMap data into customized maps with minimal code. Create minimalist or multi-layered ma… ☆13 · Feb 25, 2026 · Updated last week
- decontamination ☆26 · Updated this week
- A PyTorch implementation of Proxy Anchor Loss based on CVPR 2020 paper "Proxy Anchor Loss for Deep Metric Learning" ☆11 · Jan 16, 2021 · Updated 5 years ago
- A guide to structured generation using constrained decoding ☆14 · Jun 9, 2024 · Updated last year
- BERT score for text generation ☆12 · Jan 15, 2025 · Updated last year