yihedeng9 / rlhf-summary-notes
A brief and partial summary of RLHF algorithms.
☆93 · Updated 2 months ago
Alternatives and similar repositories for rlhf-summary-notes:
Users who are interested in rlhf-summary-notes are comparing it to the repositories listed below:
- Curation of resources for LLM mathematical reasoning, most of which are screened by @tongyx361 to ensure high quality and accompanied wit… ☆114 · Updated 7 months ago
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision ☆115 · Updated 5 months ago
- What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective ☆59 · Updated 3 months ago
- [NeurIPS'24 Spotlight] Observational Scaling Laws ☆50 · Updated 4 months ago
- A curated list of awesome resources dedicated to Scaling Laws for LLMs ☆69 · Updated last year
- Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)" ☆154 · Updated 2 months ago
- Official GitHub repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024] ☆130 · Updated 5 months ago
- Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis. ☆104 · Updated 3 weeks ago
- Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment" ☆121 · Updated 3 months ago
- Function Vectors in Large Language Models (ICLR 2024) ☆138 · Updated 4 months ago
- [NeurIPS 2024] Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models" ☆109 · Updated 11 months ago
- [ACL'24] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization ☆66 · Updated 6 months ago
- [NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving* ☆94 · Updated 2 months ago
- [NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623 ☆77 · Updated 4 months ago
- Open-source code for the paper "Retrieval Head Mechanistically Explains Long-Context Factuality" ☆173 · Updated 6 months ago
- [NeurIPS 2024] The official implementation of the paper "Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs" ☆95 · Updated 4 months ago
- [NeurIPS 2024 Spotlight] Code and data for the paper "Finding Transformer Circuits with Edge Pruning". ☆46 · Updated 2 months ago
- FeatureAlignment = Alignment + Mechanistic Interpretability ☆28 · Updated last month
- Repo for the paper "Free Process Rewards without Process Labels" ☆123 · Updated last month