haizelabs / verdict
Inference-time scaling for LLMs-as-a-judge.
☆329 Nov 5, 2025 Updated 3 months ago
Alternatives and similar repositories for verdict
Users who are interested in verdict compare it to the libraries listed below.
- ⚖️ Awesome LLM Judges ⚖️ ☆174 Apr 28, 2025 Updated 9 months ago
- a single interface around speech-to-speech foundation models ☆27 Jun 27, 2025 Updated 7 months ago
- ☆20 Apr 24, 2025 Updated 9 months ago
- ☆37 Aug 4, 2025 Updated 6 months ago
- j1-micro (1.7B) & j1-nano (600M) are absurdly tiny but mighty reward models. ☆102 Jul 19, 2025 Updated 6 months ago
- ☆29 Oct 24, 2025 Updated 3 months ago
- In-Context Learning for eXtreme Multi-Label Classification (XMC) using only a handful of examples. ☆447 Feb 13, 2024 Updated 2 years ago
- Delphi was the home of a temple to Phoebus Apollo, which famously had the inscription, 'Know Thyself.' This library lets language models … ☆241 Feb 9, 2026 Updated last week
- A framework for optimizing DSPy programs with RL ☆313 Jan 12, 2026 Updated last month
- Our library for RL environments + evals ☆3,833 Updated this week
- ☆137 Mar 20, 2025 Updated 10 months ago
- Synthetic data curation for post-training and structured data extraction ☆1,631 Jan 24, 2026 Updated 3 weeks ago
- End-to-end Generative Optimization for AI Agents ☆708 Dec 10, 2025 Updated 2 months ago
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆72 Feb 29, 2024 Updated last year
- QAlign is a new test-time alignment approach that improves language model performance by using Markov chain Monte Carlo methods. ☆26 Updated this week
- nyc is so back ☆20 Jun 27, 2025 Updated 7 months ago
- A subset of jailbreaks automatically discovered by the Haize Labs haizing suite. ☆100 Apr 13, 2025 Updated 10 months ago
- ☆40 Jul 26, 2024 Updated last year
- Simple repository for training small reasoning models ☆49 Feb 6, 2025 Updated last year
- A Qwen .5B reasoning model trained on OpenR1-Math-220k ☆14 Oct 11, 2025 Updated 4 months ago
- Simple and efficient DeepSeek V3 SFT using pipeline parallel and expert parallel, with both FP8 and BF16 trainings ☆115 Jul 27, 2025 Updated 6 months ago
- Investigating the generalization behavior of LM probes trained to predict truth labels: (1) from one annotator to another, and (2) from e… ☆28 May 23, 2024 Updated last year
- ☆26 Jan 14, 2025 Updated last year
- ☆25 Nov 13, 2025 Updated 3 months ago
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding. ☆176 Jan 16, 2025 Updated last year
- ☆223 Feb 9, 2026 Updated last week
- Red-Teaming Language Models with DSPy ☆251 Feb 13, 2025 Updated last year
- A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models. ☆1,593 Dec 20, 2025 Updated last month
- Extract full next-token probabilities via language model APIs ☆248 Feb 23, 2024 Updated last year
- ☆162 Dec 2, 2024 Updated last year
- Use Hermes-2-Pro-Mistral-7B function calling with your OpenAI API compatible code. ☆18 May 7, 2024 Updated last year
- Code for reproducing our paper "Low Rank Adapting Models for Sparse Autoencoder Features" ☆17 Mar 31, 2025 Updated 10 months ago
- Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers ☆27 Mar 1, 2025 Updated 11 months ago
- ☆123 Feb 21, 2025 Updated 11 months ago
- Automatic evals for LLMs ☆578 Dec 23, 2025 Updated last month
- Optimizing inference proxy for LLMs ☆3,324 Jan 28, 2026 Updated 2 weeks ago
- Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse … ☆853 Feb 7, 2026 Updated last week
- Go ahead and axolotl questions ☆11,289 Updated this week
- Late Interaction Models Training & Retrieval ☆701 Feb 9, 2026 Updated last week