☆46 Mar 4, 2025 Updated last year
Alternatives and similar repositories for DeltaBench
Users who are interested in DeltaBench are comparing it to the repositories listed below
- Official implementation of the paper “Reconsidering Overthinking: Penalizing Internal and External Redundancy in CoT Reasoning” ☆20 Aug 20, 2025 Updated 6 months ago
- ☆17 Mar 13, 2025 Updated last year
- ☆76 Jan 24, 2025 Updated last year
- [ACL' 25] The official code repository for PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models. ☆88 Feb 15, 2025 Updated last year
- Microsoft question-answering dataset ☆10 Jun 16, 2023 Updated 2 years ago
- ☆23 Jun 13, 2024 Updated last year
- We introduce EfficientRAG, an efficient retriever for multi-hop question answering. EfficientRAG iteratively generates new queries withou… ☆17 Mar 4, 2025 Updated last year
- ☆21 Oct 10, 2025 Updated 5 months ago
- Code, Data and Model for Paper "Learning from Peers in Reasoning Models" ☆27 May 13, 2025 Updated 10 months ago
- PyTorch implementation for NAACL 2022 paper: "Document-Level Relation Extraction with Sentences Importance Estimation and Focusing" ☆17 Apr 29, 2022 Updated 3 years ago
- ☆20 Apr 16, 2025 Updated 11 months ago
- ☆17 Aug 30, 2025 Updated 6 months ago
- (ICML 2025) Rethinking Chain-of-Thought from the Perspective of Self-Training ☆13 Feb 15, 2025 Updated last year
- [ICLR 2026] Official repo for "Spotlight on Token Perception for Multimodal Reinforcement Learning" ☆50 Jan 30, 2026 Updated last month
- An exploration of LLM steering ☆25 Jun 15, 2024 Updated last year
- [ICLR 2025] Official codebase for the ICLR 2025 paper "Multimodal Situational Safety" ☆30 Jun 23, 2025 Updated 8 months ago
- e ☆43 Apr 23, 2025 Updated 10 months ago
- ☆16 Jul 23, 2024 Updated last year
- [ACL 2024] Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization ☆29 Jul 9, 2024 Updated last year
- Paper list for the paper "Authorship Attribution in the Era of Large Language Models: Problems, Methodologies, and Challenges (SIGKDD Exp… ☆18 Dec 23, 2024 Updated last year
- adds Sequence Parallelism into LLaMA-Factory ☆12 Dec 31, 2024 Updated last year
- NOMU: Neural Optimization-based Model Uncertainty ☆10 Feb 17, 2023 Updated 3 years ago
- ☆10 Aug 24, 2023 Updated 2 years ago
- Skill-Inject: Measuring Agent Vulnerability to Skill File Attacks ☆34 Feb 24, 2026 Updated 3 weeks ago
- ☆36 Aug 28, 2025 Updated 6 months ago
- Official code and data repository of MathChat: MathChat: Benchmarking Mathematical Reasoning and Instruction Following in Multi-Turn Inte… ☆22 Jun 3, 2024 Updated last year
- Official repository of paper "Context-DPO: Aligning Language Models for Context-Faithfulness" ☆21 Feb 17, 2025 Updated last year
- ☆21 Aug 8, 2025 Updated 7 months ago
- JudgeLRM: Large Reasoning Models as a Judge ☆41 Jan 29, 2026 Updated last month
- Cross-domain word representation learning ☆10 May 23, 2015 Updated 10 years ago
- ☆40 Aug 6, 2025 Updated 7 months ago
- Code for our NeurIPS 2024 paper Improved Generation of Adversarial Examples Against Safety-aligned LLMs ☆12 Nov 7, 2024 Updated last year
- Reasoning Agentic Retrieval-Augmented Generation for Industry Challenges ☆28 May 14, 2025 Updated 10 months ago
- A toy text-to-image model trained from scratch. ☆19 Jun 9, 2025 Updated 9 months ago
- ☆21 Dec 14, 2024 Updated last year
- [NeurIPS 2025@FoRLM] R1-Compress: Long Chain-of-Thought Compression via Chunk Compression and Search ☆17 Jan 24, 2026 Updated last month
- Extensive Self-Contrast Enables Feedback-Free Language Model Alignment ☆21 Apr 2, 2024 Updated last year
- ☆15 May 30, 2025 Updated 9 months ago
- [ICLR-2026] Official Implementation of our paper "THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning". ☆32 Feb 26, 2026 Updated 3 weeks ago