Zyq-scut / RLTF
Accepted by Transactions on Machine Learning Research (TMLR)
☆137 · Oct 5, 2024 · Updated last year
Alternatives and similar repositories for RLTF
Users who are interested in RLTF are comparing it to the libraries listed below.
- Code for the TMLR 2023 paper "PPOCoder: Execution-based Code Generation using Deep Reinforcement Learning" ☆118 · Jan 9, 2024 · Updated 2 years ago
- This is the official code for the paper CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning (Neur… ☆558 · Jan 21, 2025 · Updated last year
- StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback ☆74 · Aug 31, 2024 · Updated last year
- ☆44 · Jun 2, 2024 · Updated last year
- Training and Benchmarking LLMs for Code Preference. ☆37 · Nov 15, 2024 · Updated last year
- Pytorch based implementation of Upside Down Reinforcement Learning (UDRL) by J. Schmidhuber et al. ☆11 · May 1, 2020 · Updated 5 years ago
- ☆52 · Feb 12, 2025 · Updated last year
- Open Source WizardCoder Dataset ☆164 · Jul 12, 2023 · Updated 2 years ago
- CodeUltraFeedback: aligning large language models to coding preferences (TOSEM 2025) ☆73 · Jun 25, 2024 · Updated last year
- Official Implementation of NeurIPS'23 Paper "Cross-Episodic Curriculum for Transformer Agents" ☆31 · Oct 12, 2023 · Updated 2 years ago
- ☆15 · Feb 28, 2024 · Updated last year
- evol augment any dataset online ☆61 · Aug 3, 2023 · Updated 2 years ago
- Official repository for "BLEUBERI: BLEU is a surprisingly effective reward for instruction following" ☆31 · Jun 5, 2025 · Updated 8 months ago
- Fine-tune SantaCoder for Code/Text Generation. ☆196 · Apr 11, 2023 · Updated 2 years ago
- AskIt (for JavaScript/TypeScript): Unified programming interface for large language models (GPT-4, GPT-3.5) ☆35 · Oct 1, 2023 · Updated 2 years ago
- Pytorch Implementation for "Preserving Linear Separability in Continual Learning by Backward Feature Projection" (CVPR 2023) ☆18 · Jun 29, 2023 · Updated 2 years ago
- [NIPS2023] RRHF & Wombat ☆808 · Sep 22, 2023 · Updated 2 years ago
- Self-Alignment with Principle-Following Reward Models ☆169 · Sep 18, 2025 · Updated 4 months ago
- Scripts for generating synthetic finetuning data for reducing sycophancy. ☆121 · Aug 16, 2023 · Updated 2 years ago
- [NAACL 2024] Struc-Bench: Are Large Language Models Good at Generating Complex Structured Tabular Data? https://aclanthology.org/2024.naa… ☆55 · Jul 31, 2025 · Updated 6 months ago
- Astraios: Parameter-Efficient Instruction Tuning Code Language Models ☆63 · Apr 10, 2024 · Updated last year
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint" ☆39 · Jan 12, 2024 · Updated 2 years ago
- ☆671 · Nov 1, 2024 · Updated last year
- For our ACL25 Paper: Can Language Models Replace Programmers? RepoCod Says ‘Not Yet’ - by Shanchao Liang and Yiran Hu and Nan Jiang and L… ☆25 · Aug 27, 2025 · Updated 5 months ago
- Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL" ☆202 · Apr 17, 2025 · Updated 9 months ago
- Code for paper "LEVER: Learning to Verify Language-to-Code Generation with Execution" (ICML'23) ☆90 · Jul 5, 2023 · Updated 2 years ago
- A recipe for online RLHF and online iterative DPO. ☆539 · Dec 28, 2024 · Updated last year
- APPS: Automated Programming Progress Standard (NeurIPS 2021) ☆501 · Jun 19, 2024 · Updated last year
- Data and code for "DocPrompting: Generating Code by Retrieving the Docs" @ICLR 2023 ☆251 · Dec 15, 2023 · Updated 2 years ago
- [NeurIPS'24] SelfCodeAlign: Self-Alignment for Code Generation ☆323 · Feb 24, 2025 · Updated 11 months ago
- Package to optimize Adversarial Attacks against (Large) Language Models with Varied Objectives ☆70 · Feb 22, 2024 · Updated last year
- NeurIPS 2023 - Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer ☆46 · Mar 29, 2024 · Updated last year
- This is the repo for the paper Shepherd -- A Critic for Language Model Generation ☆222 · Aug 10, 2023 · Updated 2 years ago
- ☆25 · Dec 20, 2023 · Updated 2 years ago
- A framework for the evaluation of autoregressive code generation language models. ☆1,020 · Jul 22, 2025 · Updated 6 months ago
- [NeurIPS 2023 D&B] Code repository for InterCode benchmark https://arxiv.org/abs/2306.14898 ☆240 · May 5, 2024 · Updated last year
- ☆11 · Dec 16, 2024 · Updated last year
- 🐣🕐📅 A simple utility to draft scheduling emails. ☆12 · Sep 13, 2023 · Updated 2 years ago
- All Data Tools ☆10 · Feb 28, 2023 · Updated 2 years ago