ash-neupane / multi-token-pred
Train toy models using a multi-token prediction objective
☆13Updated last year
Alternatives and similar repositories for multi-token-pred
Users interested in multi-token-pred are comparing it to the libraries listed below
- A Sober Look at Language Model Reasoning☆81Updated last month
- Official implementation of the ICLR 2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and…☆66Updated 4 months ago
- ☆71Updated 8 months ago
- [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight)☆107Updated last month
- ☆117Updated 4 months ago
- ☆44Updated 3 months ago
- [NAACL 25 main] Awesome LLM Causal Reasoning is a collection of LLM-based causal reasoning works, including papers, code, and datasets.☆72Updated 5 months ago
- Reference implementation for Token-level Direct Preference Optimization (TDPO)☆146Updated 5 months ago
- [ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal…☆53Updated 2 years ago
- ☆18Updated last week
- [NAACL 2025] A Closer Look into Mixture-of-Experts in Large Language Models☆52Updated 6 months ago
- ☆32Updated 9 months ago
- A curated list of awesome LLM Inference-Time Self-Improvement (ITSI, pronounced "itsy") papers from our recent survey: A Survey on Large …☆88Updated 7 months ago
- Code for ICLR 2025 Paper "What is Wrong with Perplexity for Long-context Language Modeling?"☆92Updated 2 weeks ago
- ☆48Updated last month
- Research Code for preprint "Optimizing Test-Time Compute via Meta Reinforcement Finetuning".☆100Updated this week
- [ACL'24] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization☆85Updated 11 months ago
- The code for creating the iGSM datasets in papers "Physics of Language Models Part 2.1, Grade-School Math and the Hidden Reasoning Proces…☆61Updated 6 months ago
- ☆24Updated 3 months ago
- Official Implementation for EMNLP 2024 (main) "AgentReview: Exploring Academic Peer Review with LLM Agent."☆83Updated 8 months ago
- [NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models☆51Updated 2 months ago
- [NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623☆86Updated 10 months ago
- End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning☆162Updated this week
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision☆123Updated 11 months ago
- Code for "Towards Revealing the Mystery behind Chain of Thought: a Theoretical Perspective"☆20Updated 2 years ago
- Representation Surgery for Multi-Task Model Merging. ICML, 2024.☆46Updated 10 months ago
- [NeurIPS 2024] Official code of $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$☆45Updated 9 months ago
- ☆155Updated 2 months ago
- [NeurIPS 2024 Spotlight] Code and data for the paper "Finding Transformer Circuits with Edge Pruning".☆59Updated 2 weeks ago
- Exploring whether LLMs perform case-based or rule-based reasoning☆29Updated last year