☆46 Mar 4, 2025 Updated last year
Alternatives and similar repositories for DeltaBench
Users who are interested in DeltaBench are comparing it to the repositories listed below.
- ☆18 Mar 13, 2025 Updated last year
- ☆79 Jan 24, 2025 Updated last year
- [ACL' 25] The official code repository for PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models. ☆91 Feb 15, 2025 Updated last year
- ☆22 Oct 10, 2025 Updated 7 months ago
- ☆20 Apr 16, 2025 Updated last year
- (ICML 2025) Rethinking Chain-of-Thought from the Perspective of Self-Training ☆13 Feb 15, 2025 Updated last year
- ☆22 Aug 30, 2025 Updated 8 months ago
- An exploration of LLM steering ☆26 Jun 15, 2024 Updated last year
- [COLM'25] Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill? ☆38 Jun 5, 2025 Updated 11 months ago
- e ☆43 Apr 23, 2025 Updated last year
- [ICLR 2025] Official codebase for the ICLR 2025 paper "Multimodal Situational Safety" ☆35 Jun 23, 2025 Updated 10 months ago
- ☆16 Jul 23, 2024 Updated last year
- [ACL 2024] Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization ☆29 Jul 9, 2024 Updated last year
- Paper list for the paper "Authorship Attribution in the Era of Large Language Models: Problems, Methodologies, and Challenges (SIGKDD Exp… ☆19 Apr 5, 2026 Updated last month
- [arXiv 2024] FairVision: Equitable Deep Learning for Eye Disease Screening via Fair Identity Scaling ☆16 Apr 15, 2026 Updated last month
- NOMU: Neural Optimization-based Model Uncertainty ☆10 Feb 17, 2023 Updated 3 years ago
- Adds Sequence Parallelism into LLaMA-Factory ☆12 Dec 31, 2024 Updated last year
- [ICLR 2026] Official repo for "Spotlight on Token Perception for Multimodal Reinforcement Learning" ☆62 Apr 3, 2026 Updated last month
- ☆10 Aug 24, 2023 Updated 2 years ago
- ☆36 Apr 13, 2026 Updated last month
- Official code and data repository of MathChat: MathChat: Benchmarking Mathematical Reasoning and Instruction Following in Multi-Turn Inte… ☆22 Jun 3, 2024 Updated last year
- JudgeLRM: Large Reasoning Models as a Judge ☆42 May 6, 2026 Updated 2 weeks ago
- ☆42 Apr 8, 2026 Updated last month
- Code for our NeurIPS 2024 paper "Improved Generation of Adversarial Examples Against Safety-aligned LLMs" ☆12 Nov 7, 2024 Updated last year
- Reasoning Agentic Retrieval-Augmented Generation for Industry Challenges ☆28 May 14, 2025 Updated last year
- A toy text-to-image model trained from scratch. ☆19 Jun 9, 2025 Updated 11 months ago
- The official repo of FineSure (ACL-2024) ☆36 Jul 8, 2024 Updated last year
- [NeurIPS 2025@FoRLM] R1-Compress: Long Chain-of-Thought Compression via Chunk Compression and Search ☆17 Jan 24, 2026 Updated 3 months ago
- ☆21 Dec 14, 2024 Updated last year
- Official repository of paper "Context-DPO: Aligning Language Models for Context-Faithfulness" ☆23 Feb 17, 2025 Updated last year
- Extensive Self-Contrast Enables Feedback-Free Language Model Alignment ☆20 Apr 2, 2024 Updated 2 years ago
- ☆16 May 30, 2025 Updated 11 months ago
- [ICLR-2026] Official Implementation of our paper "THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning". ☆32 Feb 26, 2026 Updated 2 months ago
- This repository contains data, code and models for contextual noncompliance. ☆25 Jul 18, 2024 Updated last year
- Materials for "Multi-property Steering of Large Language Models with Dynamic Activation Composition" ☆14 Nov 22, 2024 Updated last year
- [ACL 2025] iAgent: LLM Agent as a Shield between User and Recommender Systems ☆30 May 23, 2025 Updated 11 months ago
- [ACL 2024] Masked Thought: Simply Masking Partial Reasoning Steps Can Improve Mathematical Reasoning Learning of Language Models ☆27 Jul 9, 2024 Updated last year
- Links to publications that focus on the interpretation and analysis of in-context learning ☆15 Oct 17, 2024 Updated last year
- [ICML 2025] Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment (https://arxiv.org/abs/2410.02197) ☆41 Sep 8, 2025 Updated 8 months ago