☆46 · Mar 4, 2025 · Updated last year
Alternatives and similar repositories for DeltaBench
Users that are interested in DeltaBench are comparing it to the libraries listed below.
- Official implementation of the paper “Reconsidering Overthinking: Penalizing Internal and External Redundancy in CoT Reasoning” ☆20 · Aug 20, 2025 · Updated 8 months ago
- ☆18 · Mar 13, 2025 · Updated last year
- ☆36 · Jan 7, 2025 · Updated last year
- [ACL' 25] The official code repository for PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models. ☆91 · Feb 15, 2025 · Updated last year
- Code, Data and Model for Paper "Learning from Peers in Reasoning Models" ☆26 · May 13, 2025 · Updated 11 months ago
- ☆20 · Apr 16, 2025 · Updated last year
- (ICML 2025) Rethinking Chain-of-Thought from the Perspective of Self-Training ☆13 · Feb 15, 2025 · Updated last year
- An exploration of LLM steering ☆26 · Jun 15, 2024 · Updated last year
- e ☆43 · Apr 23, 2025 · Updated last year
- [ICLR 2025] Official codebase for the ICLR 2025 paper "Multimodal Situational Safety" ☆33 · Jun 23, 2025 · Updated 10 months ago
- ☆16 · Jul 23, 2024 · Updated last year
- [ACL 2024] Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization ☆29 · Jul 9, 2024 · Updated last year
- [arXiv 2024] FairVision: Equitable Deep Learning for Eye Disease Screening via Fair Identity Scaling ☆16 · Apr 15, 2026 · Updated 2 weeks ago
- [ICLR 2026] Official repo for "Spotlight on Token Perception for Multimodal Reinforcement Learning" ☆59 · Apr 3, 2026 · Updated 3 weeks ago
- ☆10 · Aug 24, 2023 · Updated 2 years ago
- ☆36 · Apr 13, 2026 · Updated 2 weeks ago
- Official code and data repository of MathChat: MathChat: Benchmarking Mathematical Reasoning and Instruction Following in Multi-Turn Inte… ☆22 · Jun 3, 2024 · Updated last year
- ☆24 · Aug 8, 2025 · Updated 8 months ago
- Cross-domain word representation learning ☆10 · May 23, 2015 · Updated 10 years ago
- JudgeLRM: Large Reasoning Models as a Judge ☆41 · Apr 7, 2026 · Updated 3 weeks ago
- Reasoning Agentic Retrieval-Augmented Generation for Industry Challenges ☆28 · May 14, 2025 · Updated 11 months ago
- [NeurIPS 2025@FoRLM] R1-Compress: Long Chain-of-Thought Compression via Chunk Compression and Search ☆17 · Jan 24, 2026 · Updated 3 months ago
- ☆21 · Dec 14, 2024 · Updated last year
- Extensive Self-Contrast Enables Feedback-Free Language Model Alignment ☆20 · Apr 2, 2024 · Updated 2 years ago
- Official repository of paper "Context-DPO: Aligning Language Models for Context-Faithfulness" ☆23 · Feb 17, 2025 · Updated last year
- ☆16 · May 30, 2025 · Updated 10 months ago
- [ICLR-2026] Official Implementation of our paper "THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning". ☆32 · Feb 26, 2026 · Updated 2 months ago
- This repository contains data, code and models for contextual noncompliance. ☆25 · Jul 18, 2024 · Updated last year
- [ACL 2024] Masked Thought: Simply Masking Partial Reasoning Steps Can Improve Mathematical Reasoning Learning of Language Models ☆27 · Jul 9, 2024 · Updated last year
- ☆14 · Jun 27, 2019 · Updated 6 years ago
- This repository is for the "LLM-Aligned Geographic Item Tokenization for Local-Life Recommendation". ☆17 · Nov 18, 2025 · Updated 5 months ago
- [ICML 2025] Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment (https://arxiv.org/abs/2410.02197) ☆40 · Sep 8, 2025 · Updated 7 months ago
- [NeurIPS 2025] Official Implementation of paper "Sherlock: Self-Correcting Reasoning in Vision-Language Models" ☆29 · Sep 18, 2025 · Updated 7 months ago
- Collection of peptide de novo sequencing algorithms by BEAM labs ☆30 · Dec 6, 2025 · Updated 4 months ago
- ☆10 · Aug 23, 2022 · Updated 3 years ago
- ☆15 · May 28, 2024 · Updated last year
- Code to reproduce the paper "Do causal predictors generalize better to new domains?" ☆16 · Feb 7, 2025 · Updated last year
- Official repo for ACL 2023 paper Code4Struct: Code Generation for Few-Shot Structured Prediction from Natural Language. ☆43 · Jan 7, 2024 · Updated 2 years ago
- UQ: Assessing Language Models on Unsolved Questions ☆30 · Aug 26, 2025 · Updated 8 months ago