haizelabs / Awesome-LLM-Judges
⚖️ Awesome LLM Judges ⚖️
☆86 Updated last month
Alternatives and similar repositories for Awesome-LLM-Judges:
Users who are interested in Awesome-LLM-Judges are comparing it to the libraries listed below.
- Scaling inference-time compute for LLM-as-a-judge, automated evaluations, guardrails, and reinforcement learning. ☆189 Updated last week
- Train your own SOTA deductive reasoning model ☆81 Updated 2 weeks ago
- Claude Deep Research config for Claude Code. ☆155 Updated last week
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding. ☆168 Updated 2 months ago
- Letting Claude Code develop his own MCP tools :) ☆90 Updated 2 weeks ago
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens ☆138 Updated last month
- ☆106 Updated this week
- A comprehensive repository of reasoning tasks for LLMs (and beyond) ☆425 Updated 5 months ago
- Prompt design in Python ☆55 Updated 3 months ago
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning? ☆52 Updated last week
- Kura is a simple reproduction of the CLIO paper which uses language models to label user behaviour before clustering them based on embedd… ☆91 Updated last month
- smolLM with Entropix sampler on pytorch ☆150 Updated 4 months ago
- ☆144 Updated 3 weeks ago
- ☆84 Updated 6 months ago
- MLX port for xjdr's entropix sampler (mimics jax implementation) ☆63 Updated 4 months ago
- A Collection of Competitive Text-Based Games for Language Model Evaluation and Reinforcement Learning ☆97 Updated this week
- ☆150 Updated 3 months ago
- Just a bunch of benchmark logs for different LLMs ☆119 Updated 7 months ago
- Solving data for LLMs - Create quality synthetic datasets! ☆145 Updated 2 months ago
- look how they massacred my boy ☆63 Updated 5 months ago
- Simple examples using Argilla tools to build AI ☆53 Updated 4 months ago
- Code for the paper 🌳 Tree Search for Language Model Agents ☆186 Updated 8 months ago
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆91 Updated 2 weeks ago
- Attribute (or cite) statements generated by LLMs back to in-context information. ☆218 Updated 5 months ago
- Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vecto… ☆229 Updated last month
- A user interface for DSPy ☆140 Updated 5 months ago
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file. ☆165 Updated 2 weeks ago
- Red-Teaming Language Models with DSPy ☆175 Updated last month
- ☆97 Updated 5 months ago
- ☆96 Updated 5 months ago