sanowl / Self-Correcting-LLM--Reinforcement-Learning-
This is my attempt to create a self-correcting LLM based on the paper "Training Language Models to Self-Correct via Reinforcement Learning" by Google.
☆16 · Updated last week
Related projects
Alternatives and complementary repositories for Self-Correcting-LLM--Reinforcement-Learning-
- [EMNLP 2024] Source code for the paper "Learning Planning-based Reasoning with Trajectory Collection and Process Rewards Synthesizing". ☆42 · Updated 2 months ago
- A simple toolkit for benchmarking LLMs on mathematical reasoning tasks. 🧮✨ ☆103 · Updated 6 months ago
- [NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving* ☆78 · Updated last month
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint" ☆33 · Updated 10 months ago
- ☆22 · Updated this week
- Implementation of ICML 23 Paper: Specializing Smaller Language Models towards Multi-Step Reasoning. ☆125 · Updated last year
- Evaluating Mathematical Reasoning Beyond Accuracy ☆37 · Updated 7 months ago
- official implementation of paper "Process Reward Model with Q-value Rankings" ☆14 · Updated 3 weeks ago
- [SIGIR'24] The official implementation code of MOELoRA. ☆127 · Updated 3 months ago
- Implementation of the ICML 2024 paper "Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning" pr… ☆74 · Updated 9 months ago
- [NAACL 2024 Outstanding Paper] Source code for the NAACL 2024 paper entitled "R-Tuning: Instructing Large Language Models to Say 'I Don't… ☆83 · Updated 4 months ago
- ☆51 · Updated 7 months ago
- Curation of resources for LLM mathematical reasoning, most of which are screened by @tongyx361 to ensure high quality and accompanied wit… ☆84 · Updated 4 months ago
- UniGen: A Unified Framework for Dataset Generation via Large Language Model ☆29 · Updated last month
- LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models ☆69 · Updated last month
- [EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey" ☆86 · Updated 2 months ago
- [NeurIPS 2024] Knowledge Circuits in Pretrained Transformers ☆75 · Updated last month
- The code and data for the paper JiuZhang3.0 ☆35 · Updated 5 months ago
- [NeurIPS 2024 Oral] Aligner: Efficient Alignment by Learning to Correct ☆120 · Updated last week
- Code associated with Tuning Language Models by Proxy (Liu et al., 2024) ☆97 · Updated 7 months ago
- [ACL 2024] Long-Context Language Modeling with Parallel Encodings ☆147 · Updated 5 months ago
- [ACL 2024] The official codebase for the paper "Self-Distillation Bridges Distribution Gap in Language Model Fine-tuning". ☆100 · Updated 2 weeks ago
- [ICML 2024] Selecting High-Quality Data for Training Language Models ☆146 · Updated 5 months ago
- The official repository of the Omni-MATH benchmark. ☆49 · Updated 2 weeks ago
- ☆65 · Updated 6 months ago
- [ICML'2024] Can AI Assistants Know What They Don't Know? ☆70 · Updated 9 months ago
- In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation (ICML 2024) ☆45 · Updated 7 months ago
- ☆54 · Updated 2 months ago
- Codes for papers on Large Language Models Personalization (LaMP) ☆126 · Updated last month
- ☆17 · Updated 4 months ago