☆14Apr 14, 2025Updated last year
Alternatives and similar repositories for PairJudgeRM
Users who are interested in PairJudgeRM are comparing it to the libraries listed below.
- RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment☆17Dec 19, 2024Updated last year
- ☆21Sep 11, 2025Updated 7 months ago
- Website for TREC RAG☆14Aug 19, 2025Updated 7 months ago
- Control LLM☆22Apr 6, 2025Updated last year
- ☆15Sep 10, 2023Updated 2 years ago
- ☆33Oct 13, 2025Updated 6 months ago
- Identification of the Adversary from a Single Adversarial Example (ICML 2023)☆10Jul 15, 2024Updated last year
- A Collection of Papers on Diffusion Large Language Models☆45Mar 10, 2026Updated last month
- ☆16Oct 18, 2024Updated last year
- Code repository for the paper on "Predicting the Performance of Black-Box LLMs through Self-Queries".☆12Jan 9, 2025Updated last year
- ☆13Jan 22, 2025Updated last year
- Code for reproducing our paper "Low Rank Adapting Models for Sparse Autoencoder Features"☆17Mar 31, 2025Updated last year
- Official repository of paper "Context-DPO: Aligning Language Models for Context-Faithfulness"☆23Feb 17, 2025Updated last year
- ☆32Oct 30, 2023Updated 2 years ago
- My personal site, using Wowchemy☆12Apr 2, 2026Updated 2 weeks ago
- Improving Your Model Ranking on Chatbot Arena by Vote Rigging (ICML 2025)☆26Feb 25, 2025Updated last year
- [TVCG & VR'25] LAPIG: Language Guided Projector Image Generation with Surface Adaptation and Stylization☆11Updated this week
- ☆17Jan 9, 2025Updated last year
- Code and data release of the paper Enhancing LLM Complex Problem-Solving with Hybrid Thinking and Dynamic Workflows☆15Oct 4, 2024Updated last year
- Sys2Bench is a benchmarking suite designed to evaluate reasoning and planning capabilities of large language models across algorithmic, l…☆30Mar 5, 2025Updated last year
- Please go to https://github.com/facebookresearch/stable_signature☆13Jul 26, 2023Updated 2 years ago
- [ACL 2025] Are Your LLMs Capable of Stable Reasoning?☆32Aug 5, 2025Updated 8 months ago
- Official repository for the paper Number Cookbook: Number Understanding of Language Models and How to Improve It.☆20Mar 31, 2025Updated last year
- To Think or Not to Think: Exploring the Unthinking Vulnerability in Large Reasoning Models☆33May 21, 2025Updated 10 months ago
- Codes for ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding [ICML 2025]☆48Jul 22, 2025Updated 8 months ago
- https://footprints.baulab.info☆18Oct 4, 2024Updated last year
- CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmentation Generation☆66May 21, 2025Updated 10 months ago
- Implementation code for ACL 2024: Advancing Parameter Efficiency in Fine-tuning via Representation Editing☆15Apr 20, 2024Updated last year
- [ACL2024] Exploring the Potential of Large Language Models in Computational Argumentation☆18Aug 21, 2024Updated last year
- This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning"☆73Apr 22, 2025Updated 11 months ago
- ☆49Apr 4, 2025Updated last year
- FlashInfer Bench @ MLSys 2026: Building AI agents to write high performance GPU kernels☆157Apr 3, 2026Updated last week
- ☆12Apr 9, 2026Updated last week
- The official repo for "VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search" [EMNLP25]☆39Feb 1, 2026Updated 2 months ago
- MinPrompt: Graph-based Minimal Prompt Data Augmentation for Few-shot Question Answering☆14May 3, 2024Updated last year
- The official code repo and data hub of top_nsigma sampling strategy for LLMs.☆26Feb 11, 2025Updated last year
- ☆19May 17, 2025Updated 11 months ago
- A python implementation of PROCLUS: PROjected CLUStering algorithm.☆10Jan 12, 2015Updated 11 years ago
- A thesis template compliant with King's College London and UCL rules☆19Dec 14, 2025Updated 4 months ago