YangLing0818 / SuperCorrect-llm
[ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction
☆87 · Mar 23, 2025 · Updated 10 months ago
Alternatives and similar repositories for SuperCorrect-llm
Users who are interested in SuperCorrect-llm are comparing it to the repositories listed below.
- [NeurIPS 2024 Spotlight] Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models ☆677 · Jun 28, 2025 · Updated 7 months ago
- [NeurIPS 2025 Spotlight] ReasonFlux (long-CoT), ReasonFlux-PRM (process reward model) and ReasonFlux-Coder (code generation) ☆519 · Sep 27, 2025 · Updated 4 months ago
- ☆123 · Feb 21, 2025 · Updated 11 months ago
- Source code for our paper: "ARIA: Training Language Agents with Intention-Driven Reward Aggregation". ☆26 · Aug 9, 2025 · Updated 6 months ago
- ☆19 · Mar 10, 2025 · Updated 11 months ago
- the datasets of our paper ☆11 · Feb 26, 2024 · Updated last year
- ☆25 · Aug 23, 2024 · Updated last year
- ☆30 · Mar 11, 2025 · Updated 11 months ago
- ☆43 · Dec 16, 2025 · Updated last month
- ☆15 · Jul 22, 2024 · Updated last year
- Implementation of the Pairformer model used in AlphaFold 3 ☆14 · Updated this week
- [NeurIPS 2024 Oral] Aligner: Efficient Alignment by Learning to Correct ☆191 · Jan 16, 2025 · Updated last year
- ☆25 · Jan 4, 2026 · Updated last month
- This is the official implementation of TAGCOS: Task-agnostic Gradient Clustered Coreset Selection for Instruction Tuning Data ☆13 · Jul 21, 2024 · Updated last year
- ☆970 · Jan 23, 2025 · Updated last year
- ☆12 · Jun 30, 2024 · Updated last year
- Official Implementation for the paper "Integrative Decoding: Improving Factuality via Implicit Self-consistency" ☆32 · Apr 12, 2025 · Updated 10 months ago
- AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories ☆40 · Aug 7, 2025 · Updated 6 months ago
- Berkeley Single Cell Computational Microscopy dataset ☆18 · Oct 27, 2025 · Updated 3 months ago
- Official implementation of "OpenCity3D: What do Vision-Language Models know about Urban Environments?" @ WACV2025 ☆16 · Nov 24, 2024 · Updated last year
- Repository for GeoUni, A Unified Model for Generating Geometry Diagrams, Problems and Problem Solutions. ☆19 · Jun 12, 2025 · Updated 8 months ago
- Code and Data for ManyModalQA: Modality Disambiguation and QA over Diverse Inputs ☆17 · Mar 2, 2020 · Updated 5 years ago
- FactCG: Enhancing Fact Checkers with Graph-Based Multi-Hop Data (NAACL 2025) ☆14 · Jul 14, 2025 · Updated 7 months ago
- MathPrompter Implementation: This repository hosts an implementation based on the 'MathPrompter: Mathematical Reasoning Using Large Langu… ☆14 · Apr 12, 2025 · Updated 10 months ago
- Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs. ☆459 · Apr 18, 2024 · Updated last year
- Code for Quiet-STaR ☆740 · Aug 21, 2024 · Updated last year
- Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference) ☆65 · Oct 18, 2024 · Updated last year
- Examples for running TeNPy ☆16 · Oct 31, 2025 · Updated 3 months ago
- Sequence-level 1F1B schedule for LLMs. ☆19 · Jun 4, 2024 · Updated last year
- ☆24 · Oct 14, 2024 · Updated last year
- Training hybrid models for dummies. ☆29 · Nov 1, 2025 · Updated 3 months ago
- Introducing Filtered Direct Preference Optimization (fDPO) that enhances language model alignment with human preferences by discarding lo… ☆16 · Nov 27, 2024 · Updated last year
- 🌟Official code of our AAAI26 paper 🔍WebFilter ☆35 · Nov 9, 2025 · Updated 3 months ago
- ☆18 · Mar 25, 2024 · Updated last year
- The official code release for Q#: Provably Optimal Distributional RL for LLM Post-Training ☆18 · Mar 4, 2025 · Updated 11 months ago
- ☆16 · Oct 27, 2024 · Updated last year
- [ICLR 2026] Learning to Reason without External Rewards ☆391 · Jan 26, 2026 · Updated 2 weeks ago
- Recipes to train the self-rewarding reasoning LLMs. ☆229 · Mar 2, 2025 · Updated 11 months ago
- ☆22 · Mar 2, 2025 · Updated 11 months ago