GFNOrg / gfn-lm-tuning
☆175 Updated last year
Alternatives and similar repositories for gfn-lm-tuning:
Users who are interested in gfn-lm-tuning are comparing it to the libraries listed below.
- Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment" ☆151 Updated 5 months ago
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision ☆120 Updated 7 months ago
- ☆90 Updated 9 months ago
- ☆93 Updated last year
- Code release for "Debating with More Persuasive LLMs Leads to More Truthful Answers" ☆103 Updated last year
- ☆114 Updated 8 months ago
- ☆104 Updated 5 months ago
- ☆79 Updated last year
- A MAD laboratory to improve AI architecture designs 🧪 ☆111 Updated 4 months ago
- Understand and test language model architectures on synthetic tasks. ☆192 Updated last month
- Function Vectors in Large Language Models (ICLR 2024) ☆161 Updated last week
- Language models scale reliably with over-training and on downstream tasks ☆96 Updated last year
- [NeurIPS'24 Spotlight] Observational Scaling Laws ☆54 Updated 6 months ago
- ☆91 Updated 2 months ago
- Delphi was the home of a temple to Phoebus Apollo, which famously had the inscription, 'Know Thyself.' This library lets language models … ☆169 Updated this week
- ☆96 Updated 9 months ago
- ☆137 Updated 5 months ago
- Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL" ☆162 Updated last week
- Flow of Reasoning: Training LLMs for Divergent Problem Solving with Minimal Examples ☆84 Updated last month
- Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al (NeurIPS 2024) ☆189 Updated 10 months ago
- A brief and partial summary of RLHF algorithms. ☆127 Updated last month
- ☆218 Updated 6 months ago
- Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging ☆99 Updated last year
- A library for efficient patching and automatic circuit discovery. ☆62 Updated 2 months ago
- Official github repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024] ☆132 Updated 7 months ago
- (ICML 2024) Alphazero-like Tree-Search can guide large language model decoding and training ☆264 Updated 10 months ago
- ☆161 Updated 2 weeks ago
- LLM-Merging: Building LLMs Efficiently through Merging ☆195 Updated 7 months ago
- Open source replication of Anthropic's Crosscoders for Model Diffing ☆52 Updated 5 months ago
- ☆62 Updated 2 years ago