☆110 Jan 4, 2026 Updated 2 months ago
Alternatives and similar repositories for IFBench
Users who are interested in IFBench are comparing it to the repositories listed below
- ☆19 Aug 4, 2025 Updated 7 months ago
- MUFFIN: Curating Multi-Faceted Instructions for Improving Instruction-Following ☆16 Oct 31, 2024 Updated last year
- MEXMA: Token-level objectives improve sentence representations ☆43 Jan 6, 2025 Updated last year
- ☆21 Jul 21, 2025 Updated 7 months ago
- ☆15 Oct 4, 2024 Updated last year
- Learning from preferences is a common paradigm for fine-tuning language models. Yet, many algorithmic design decisions come into play. Ou… ☆32 Apr 20, 2024 Updated last year
- CLaMR: Contextualized Late-Interaction for Multimodal Content Retrieval ☆23 Jun 28, 2025 Updated 8 months ago
- This is the github to open source benchmark AdvancedIF, see LAMA L1387358RCRO ☆29 Nov 26, 2025 Updated 3 months ago
- Benchmarking Complex Instruction-Following with Multiple Constraints Composition (NeurIPS 2024 Datasets and Benchmarks Track) ☆102 Feb 20, 2025 Updated last year
- ☆33 Feb 4, 2026 Updated last month
- ☆14 May 31, 2022 Updated 3 years ago
- Fork of Flame repo for training of some new stuff in development ☆19 Updated this week
- Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers ☆28 Mar 1, 2025 Updated last year
- Code and data from the paper 'Human Feedback is not Gold Standard' ☆20 Feb 24, 2026 Updated last week
- General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25] ☆223 Nov 27, 2025 Updated 3 months ago
- A repo for open research on building large reasoning models ☆140 Feb 18, 2026 Updated 2 weeks ago
- ☆19 Oct 2, 2023 Updated 2 years ago
- ☆81 Jun 23, 2025 Updated 8 months ago
- Self-Supervised Alignment with Mutual Information ☆20 May 24, 2024 Updated last year
- The official repository of NeurIPS'25 paper "Ada-R1: From Long-Cot to Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization" ☆22 Nov 9, 2025 Updated 3 months ago
- Learning to route instances for Human vs AI Feedback (ACL Main '25) ☆27 Jul 23, 2025 Updated 7 months ago
- [JAG'26] SpatialLLM: From Multi-modality Data to Urban Spatial Intelligence ☆59 Jan 8, 2026 Updated last month
- H.AI cookbook provides code examples and guides to help developers use models developed by H Company. ☆66 Feb 20, 2026 Updated last week
- ☆32 Jan 4, 2026 Updated 2 months ago
- ☆32 Jan 26, 2026 Updated last month
- Scalable toolkit for efficient model alignment ☆849 Oct 6, 2025 Updated 4 months ago
- Codebase for "Uni[MASK]: Unified Inference in Sequential Decision Problems" ☆57 Jul 3, 2024 Updated last year
- [ACL 2023] Gradient Ascent Post-training Enhances Language Model Generalization ☆29 Sep 12, 2024 Updated last year
- RewardBench: the first evaluation tool for reward models. ☆697 Feb 16, 2026 Updated 2 weeks ago
- [ACL 2025] Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems ☆125 Jun 11, 2025 Updated 8 months ago
- [EMNLP 2025] Verification Engineering for RL in Instruction Following ☆51 Jan 5, 2026 Updated 2 months ago
- Official repository for paper "Versatile Offline Imitation from Observations and Examples via Regularized State-Occupancy Matching" (ICML… ☆28 Jan 12, 2023 Updated 3 years ago
- The first spoken long-text dataset derived from live streams, designed to reflect the redundancy-rich and conversational nature of real-w… ☆12 Jun 28, 2025 Updated 8 months ago
- The official implementation of Self-Exploring Language Models (SELM) ☆63 Jun 4, 2024 Updated last year
- ☆43 Aug 15, 2025 Updated 6 months ago
- entropix style sampling + GUI ☆27 Oct 30, 2024 Updated last year
- ☆39 Aug 20, 2025 Updated 6 months ago
- The open-source code of MetaStone-S1. ☆106 Aug 1, 2025 Updated 7 months ago
- Official Code Release for "Training a Generally Curious Agent" ☆45 May 18, 2025 Updated 9 months ago