Inference-time scaling for LLMs-as-a-judge.
☆330 Nov 5, 2025 Updated 4 months ago
Alternatives and similar repositories for verdict
Users who are interested in verdict compare it to the libraries listed below.
- ⚖️ Awesome LLM Judges ⚖️ ☆189 Apr 28, 2025 Updated 10 months ago
- a single interface around speech-to-speech foundation models ☆27 Jun 27, 2025 Updated 8 months ago
- ☆20 Apr 24, 2025 Updated 10 months ago
- ☆37 Aug 4, 2025 Updated 7 months ago
- j1-micro (1.7B) & j1-nano (600M) are absurdly tiny but mighty reward models. ☆102 Jul 19, 2025 Updated 7 months ago
- ☆29 Oct 24, 2025 Updated 4 months ago
- In-Context Learning for eXtreme Multi-Label Classification (XMC) using only a handful of examples. ☆449 Feb 13, 2024 Updated 2 years ago
- Delphi was the home of a temple to Phoebus Apollo, which famously had the inscription, 'Know Thyself.' This library lets language models … ☆244 Mar 2, 2026 Updated last week
- A framework for optimizing DSPy programs with RL ☆323 Jan 12, 2026 Updated last month
- Our library for RL environments + evals ☆3,877 Updated this week
- ☆137 Mar 20, 2025 Updated 11 months ago
- [NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards ☆1,362 Jan 16, 2026 Updated last month
- Implementation for Decision-focused Summarization (EMNLP2021) ☆12 Mar 14, 2022 Updated 3 years ago
- End-to-end Generative Optimization for AI Agents ☆709 Dec 10, 2025 Updated 2 months ago
- Synthetic data curation for post-training and structured data extraction ☆1,641 Jan 24, 2026 Updated last month
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆72 Feb 29, 2024 Updated 2 years ago
- QAlign is a new test-time alignment approach that improves language model performance by using Markov chain Monte Carlo methods. ☆26 Mar 2, 2026 Updated last week
- Codebase from our first release. ☆45 Feb 17, 2026 Updated 2 weeks ago
- A subset of jailbreaks automatically discovered by the Haize Labs haizing suite. ☆100 Apr 13, 2025 Updated 10 months ago
- ☆40 Jul 26, 2024 Updated last year
- Simple repository for training small reasoning models ☆49 Feb 17, 2026 Updated 2 weeks ago
- nyc is so back ☆21 Jun 27, 2025 Updated 8 months ago
- Simple and efficient DeepSeek V3 SFT using pipeline parallel and expert parallel, with both FP8 and BF16 trainings ☆115 Jul 27, 2025 Updated 7 months ago
- ☆26 Jan 14, 2025 Updated last year
- Investigating the generalization behavior of LM probes trained to predict truth labels: (1) from one annotator to another, and (2) from e… ☆28 May 23, 2024 Updated last year
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding. ☆176 Jan 16, 2025 Updated last year
- Red-Teaming Language Models with DSPy ☆254 Feb 13, 2025 Updated last year
- Extract full next-token probabilities via language model APIs ☆248 Feb 23, 2024 Updated 2 years ago
- Easiest way to give context to LLMs; Attachments has the ambition to be the general funnel for any files to be transformed into images+te… ☆349 Sep 12, 2025 Updated 5 months ago
- A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models. ☆1,602 Dec 20, 2025 Updated 2 months ago
- ☆162 Dec 2, 2024 Updated last year
- ☆237 Updated this week
- Use Hermes-2-Pro-Mistral-7B function calling with your OpenAI API compatible code. ☆18 May 7, 2024 Updated last year
- Code for reproducing our paper "Low Rank Adapting Models for Sparse Autoencoder Features" ☆17 Mar 31, 2025 Updated 11 months ago
- Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers ☆28 Mar 1, 2025 Updated last year
- ☆123 Feb 21, 2025 Updated last year
- Automatic evals for LLMs ☆581 Feb 24, 2026 Updated last week
- Optimizing inference proxy for LLMs ☆3,352 Jan 28, 2026 Updated last month
- Go ahead and axolotl questions ☆11,395 Updated this week