OpenMOSS / LongLLaDA
LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs
☆46Updated last week
Alternatives and similar repositories for LongLLaDA
Users who are interested in LongLLaDA are comparing it to the repositories listed below.
- Official PyTorch implementation and models for paper "Diffusion Beats Autoregressive in Data-Constrained Settings". We find diffusion mod…☆113Updated last month
- [NeurIPS '25] Multi-Token Prediction Needs Registers☆25Updated last week
- Diffusion Language Models For Code Infilling Beyond Fixed-size Canvas☆91Updated 3 months ago
- ☆46Updated 2 months ago
- ☆105Updated 3 months ago
- ☆70Updated 5 months ago
- ☆17Updated 4 months ago
- Official Implementation of our paper "THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning".☆27Updated 2 months ago
- [NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623☆89Updated last year
- Code for "Language Models Can Learn from Verbal Feedback Without Scalar Rewards"☆54Updated 2 months ago
- Remasking Discrete Diffusion Models with Inference-Time Scaling☆59Updated 9 months ago
- Official repository for paper "DeepCritic: Deliberate Critique with Large Language Models"☆41Updated 5 months ago
- This is the official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation"☆39Updated last year
- JudgeLRM: Large Reasoning Models as a Judge☆40Updated this week
- dParallel: Learnable Parallel Decoding for dLLMs☆44Updated 2 months ago
- Official code for "BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning"☆36Updated 10 months ago
- Implementation of Negative-aware Finetuning (NFT) algorithm for "Bridging Supervised Learning and Reinforcement Learning in Math Reasonin…☆65Updated 3 months ago
- Optimizing Anytime Reasoning via Budget Relative Policy Optimization☆48Updated 4 months ago
- Esoteric Language Models☆108Updated 2 weeks ago
- The official code repository for the paper "Mirage or Method? How Model–Task Alignment Induces Divergent RL Conclusions".☆15Updated 3 months ago
- Geometric-Mean Policy Optimization☆95Updated 3 weeks ago
- A Sober Look at Language Model Reasoning☆89Updated 3 weeks ago
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling☆40Updated last month
- ☆23Updated last year
- Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation, ICML 2024☆22Updated last year
- [ACL 2025] Are Your LLMs Capable of Stable Reasoning?☆31Updated 4 months ago
- ☆19Updated 8 months ago
- ☆33Updated 11 months ago
- [EMNLP'25 Industry] Repo for "Z1: Efficient Test-time Scaling with Code"☆67Updated 8 months ago
- SIFT: Grounding LLM Reasoning in Contexts via Stickers☆57Updated 9 months ago