ash-neupane / multi-token-pred
Train toy models using a multi-token prediction objective
☆13Updated last year
Alternatives and similar repositories for multi-token-pred
Users who are interested in multi-token-pred are comparing it to the repositories listed below
- official implementation of ICLR'2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and…☆66Updated 5 months ago
- ☆52Updated 2 months ago
- Reference implementation for Token-level Direct Preference Optimization(TDPO)☆148Updated 7 months ago
- A Sober Look at Language Model Reasoning☆83Updated last week
- The code for creating the iGSM datasets in papers "Physics of Language Models Part 2.1, Grade-School Math and the Hidden Reasoning Proces…☆76Updated 8 months ago
- ☆122Updated 6 months ago
- [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight)☆123Updated 2 months ago
- ☆74Updated 10 months ago
- [NAACL 25 main] Awesome LLM Causal Reasoning is a collection of LLM-based causal reasoning works, including papers, code, and datasets.☆76Updated 7 months ago
- ☆100Updated last year
- ☆18Updated last month
- code for paper Query-Dependent Prompt Evaluation and Optimization with Offline Inverse Reinforcement Learning☆41Updated last year
- [NAACL 2025] A Closer Look into Mixture-of-Experts in Large Language Models☆55Updated 7 months ago
- ICML 2024 - Official Repository for EXO: Towards Efficient Exact Optimization of Language Model Alignment☆58Updated last year
- ☆45Updated 5 months ago
- [ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal…☆55Updated 2 years ago
- Preprint: Asymmetry in Low-Rank Adapters of Foundation Models☆36Updated last year
- Reinforcing General Reasoning without Verifiers☆83Updated 2 months ago
- Exploration of automated dataset selection approaches at large scales.☆47Updated 6 months ago
- LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning☆35Updated last year
- Reproduction of "RLCD: Reinforcement Learning from Contrastive Distillation for Language Model Alignment"☆69Updated 2 years ago
- A brief and partial summary of RLHF algorithms.☆132Updated 6 months ago
- ☆26Updated 5 months ago
- Representation Surgery for Multi-Task Model Merging. ICML, 2024.☆46Updated 11 months ago
- ☆34Updated last year
- [ICLR 2025] MiniPLM: Knowledge Distillation for Pre-Training Language Models☆61Updated 10 months ago
- [ACL'24] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization☆87Updated last year
- Code for "Reasoning to Learn from Latent Thoughts"☆118Updated 5 months ago
- ☆47Updated 7 months ago
- "Improving Mathematical Reasoning with Process Supervision" by OPENAI☆113Updated last week