☆15Apr 14, 2025Updated last year
Alternatives and similar repositories for PairJudgeRM
Users that are interested in PairJudgeRM are comparing it to the libraries listed below.
- RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment☆17Dec 19, 2024Updated last year
- ☆23Sep 11, 2025Updated 7 months ago
- Website for TREC RAG☆14Apr 24, 2026Updated last week
- Control LLM☆23Apr 6, 2025Updated last year
- ☆15Sep 10, 2023Updated 2 years ago
- Code for "What really matters in matrix-whitening optimizers?"☆23Oct 31, 2025Updated 6 months ago
- ☆34Oct 13, 2025Updated 6 months ago
- Identification of the Adversary from a Single Adversarial Example (ICML 2023)☆10Jul 15, 2024Updated last year
- A Collection of Papers on Diffusion Large Language Models☆46Updated this week
- ☆13Aug 17, 2020Updated 5 years ago
- ☆16Oct 18, 2024Updated last year
- Code repository for the paper on "Predicting the Performance of Black-Box LLMs through Self-Queries".☆12Jan 9, 2025Updated last year
- ☆13Jan 22, 2025Updated last year
- Code for reproducing our paper "Low Rank Adapting Models for Sparse Autoencoder Features"☆17Mar 31, 2025Updated last year
- Rethinking the Trust Region in LLM Reinforcement Learning☆53Mar 2, 2026Updated 2 months ago
- Official repository of paper "Context-DPO: Aligning Language Models for Context-Faithfulness"☆23Feb 17, 2025Updated last year
- ☆32Oct 30, 2023Updated 2 years ago
- World-Gymnast: Training Robots with Reinforcement Learning in a World Model☆34Feb 11, 2026Updated 2 months ago
- My personal site, using Wowchemy☆13Apr 24, 2026Updated last week
- Improving Your Model Ranking on Chatbot Arena by Vote Rigging (ICML 2025)☆27Feb 25, 2025Updated last year
- [TVCG & VR'25] LAPIG: Language Guided Projector Image Generation with Surface Adaptation and Stylization☆11Apr 16, 2026Updated 2 weeks ago
- ☆17Jan 9, 2025Updated last year
- Official implementation of Categorical Flow Maps on text.☆56Feb 16, 2026Updated 2 months ago
- Code and data release of the paper Enhancing LLM Complex Problem-Solving with Hybrid Thinking and Dynamic Workflows☆15Oct 4, 2024Updated last year
- Sys2Bench is a benchmarking suite designed to evaluate reasoning and planning capabilities of large language models across algorithmic, l…☆30Mar 5, 2025Updated last year
- Please go to https://github.com/facebookresearch/stable_signature☆13Jul 26, 2023Updated 2 years ago
- [ACL 2025] Are Your LLMs Capable of Stable Reasoning?☆33Aug 5, 2025Updated 9 months ago
- Official repository for the paper Number Cookbook: Number Understanding of Language Models and How to Improve It.☆20Mar 31, 2025Updated last year
- To Think or Not to Think: Exploring the Unthinking Vulnerability in Large Reasoning Models☆33May 21, 2025Updated 11 months ago
- Codes for ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding [ICML 2025]☆49Jul 22, 2025Updated 9 months ago
- https://footprints.baulab.info☆18Oct 4, 2024Updated last year
- CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmentation Generation☆66May 21, 2025Updated 11 months ago
- Implementation code for ACL 2024: Advancing Parameter Efficiency in Fine-tuning via Representation Editing☆15Apr 20, 2024Updated 2 years ago
- [ACL2024] Exploring the Potential of Large Language Models in Computational Argumentation☆18Aug 21, 2024Updated last year
- This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning"☆73Apr 22, 2025Updated last year
- ☆51Apr 4, 2025Updated last year
- FlashInfer Bench @ MLSys 2026: Building AI agents to write high performance GPU kernels☆163Apr 26, 2026Updated last week
- ☆13Apr 9, 2026Updated 3 weeks ago
- The official repo for "VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search" [EMNLP25]☆39Feb 1, 2026Updated 3 months ago