☆34Sep 14, 2024Updated last year
Alternatives and similar repositories for PF-PPO-RLHF
Users who are interested in PF-PPO-RLHF are comparing it to the libraries listed below.
- The code and data for the paper JiuZhang3.0☆49May 26, 2024Updated last year
- ☆16Jul 23, 2024Updated last year
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint"☆39Jan 12, 2024Updated 2 years ago
- ☆50Aug 21, 2025Updated 6 months ago
- Dataset Reset Policy Optimization☆31Apr 12, 2024Updated last year
- ☆12Nov 5, 2024Updated last year
- LongSpec: Long-Context Lossless Speculative Decoding with Efficient Drafting and Verification☆74Jul 14, 2025Updated 7 months ago
- Training and Benchmarking LLMs for Code Preference.☆38Nov 15, 2024Updated last year
- Code for our EMNLP 2019 paper titled "Sentence-Level Content Planning and Style Specification for Neural Text Generation"☆17May 4, 2020Updated 5 years ago
- ☆22Oct 22, 2024Updated last year
- Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment"☆186May 25, 2025Updated 9 months ago
- ☆325Jul 25, 2024Updated last year
- The official GitHub page for ''Unleashing the Potential of Large Language Models as Prompt Optimizers: An Analogical Analysis with …☆28Dec 12, 2024Updated last year
- Self-Supervised Alignment with Mutual Information☆20May 24, 2024Updated last year
- Train poincare embedding using gensim☆20May 18, 2018Updated 7 years ago
- SIFT: Grounding LLM Reasoning in Contexts via Stickers☆57Mar 6, 2025Updated last year
- papers of distilling Graph Neural Network☆24Dec 11, 2021Updated 4 years ago
- ☆29May 4, 2024Updated last year
- ☆98Jun 27, 2024Updated last year
- ☆32May 31, 2025Updated 9 months ago
- [ICML 2025] TokenSwift: Lossless Acceleration of Ultra Long Sequence Generation