Implementation for the paper "Fictitious Synthetic Data Can Improve LLM Factuality via Prerequisite Learning"
☆11Jan 10, 2025Updated last year
Alternatives and similar repositories for Prereq_tune
Users that are interested in Prereq_tune are comparing it to the libraries listed below
Sorting:
- ProsodyLM: Uncovering the Emerging Prosody Processing Capabilities in Speech Language Models☆34Nov 18, 2025Updated 3 months ago
- ACL24☆11Jun 7, 2024Updated last year
- ☆13Jun 4, 2024Updated last year
- ☆14Jan 10, 2024Updated 2 years ago
- Official repository for "BLEUBERI: BLEU is a surprisingly effective reward for instruction following"☆31Jun 5, 2025Updated 8 months ago
- Official resource for paper Investigating and Mitigating the Multimodal Hallucination Snowballing in Large Vision-Language Models (ACL 20…☆15Aug 12, 2024Updated last year
- ☆17May 19, 2023Updated 2 years ago
- ☆16Jul 23, 2024Updated last year
- ☆15Feb 21, 2024Updated 2 years ago
- ☆22Jan 25, 2023Updated 3 years ago
- ☆23Dec 17, 2024Updated last year
- Data for our paper "Defending ChatGPT against Jailbreak Attack via Self-Reminder"☆20Oct 26, 2023Updated 2 years ago
- ☆23Mar 31, 2023Updated 2 years ago
- This repository includes a benchmark and code for the paper "Evaluating LLMs at Detecting Errors in LLM Responses".☆30Aug 18, 2024Updated last year
- RewardAnything: Generalizable Principle-Following Reward Models☆45Jun 11, 2025Updated 8 months ago
- Is In-Context Learning Sufficient for Instruction Following in LLMs? [ICLR 2025]☆32Jan 23, 2025Updated last year
- ☆22Sep 20, 2023Updated 2 years ago
- TBC☆28Nov 2, 2022Updated 3 years ago
- ☆31Sep 23, 2024Updated last year
- CodeUltraFeedback: aligning large language models to coding preferences (TOSEM 2025)☆73Jun 25, 2024Updated last year
- Implementation for "Correcting Diffusion Generation through Resampling" [CVPR 2024]☆34Dec 12, 2023Updated 2 years ago
- Code for "Can Retriever-Augmented Language Models Reason? The Blame Game Between the Retriever and the Language Model", EMNLP Findings 20…☆28Nov 2, 2023Updated 2 years ago
- [NeurIPS 2025] RL Tango: Reinforcing Generator and Verifier Together for Language Reasoning☆52Oct 23, 2025Updated 4 months ago
- Codebase for Instruction Following without Instruction Tuning☆36Sep 24, 2024Updated last year
- Research without Re-search: Maximal Update Parametrization Yields Accurate Loss Prediction across Scales☆32Jul 17, 2023Updated 2 years ago
- ☆34May 9, 2025Updated 9 months ago
- [ACL 2025 Main] Official Repository for "Evaluating Language Models as Synthetic Data Generators"☆41Dec 13, 2024Updated last year
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint"☆39Jan 12, 2024Updated 2 years ago
- Code for ICML 25 paper "Metadata Conditioning Accelerates Language Model Pre-training (MeCo)"☆50Jun 30, 2025Updated 8 months ago
- The official repo for "CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models"☆29Feb 23, 2026Updated last week
- Optimizing Anytime Reasoning via Budget Relative Policy Optimization☆52Jul 15, 2025Updated 7 months ago
- 用Paddle复现论文ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information(ACL2021)☆10Nov 15, 2021Updated 4 years ago
- The code for the paper "A Bayesian Approach to Online Planning" published in ICML 2024.☆13Jun 17, 2024Updated last year
- About Code release for "Imagination Mechanism: Mesh Information Propagation for Enhancing Data Efficiency in Reinforcement Learning"☆13Oct 7, 2023Updated 2 years ago
- code for polite☆11Feb 28, 2024Updated 2 years ago
- A collection of heat engines, based on the OpenAI Gym environment framework for use with reinforcement learning applications.☆15Dec 20, 2021Updated 4 years ago
- ☆11Jan 11, 2022Updated 4 years ago
- DreamSmooth: Improving Model-Based RL with Reward Smoothing (ICLR 2024)☆12May 6, 2024Updated last year
- ☆14Mar 21, 2024Updated last year