OpenMOSS / LongLLaDA
[AAAI26] LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs
☆50 Updated last month
Alternatives and similar repositories for LongLLaDA
Users who are interested in LongLLaDA are comparing it to the repositories listed below
- ☆109 Updated 3 months ago
- Official PyTorch implementation and models for paper "Diffusion Beats Autoregressive in Data-Constrained Settings". We find diffusion mod… ☆118 Updated 2 months ago
- [NeurIPS '25] Multi-Token Prediction Needs Registers ☆26 Updated 3 weeks ago
- Diffusion Language Models For Code Infilling Beyond Fixed-size Canvas ☆97 Updated 3 months ago
- [NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623 ☆89 Updated last year
- Official repository for paper "DeepCritic: Deliberate Critique with Large Language Models" ☆40 Updated 6 months ago
- Easy and Efficient dLLM Fine-Tuning ☆190 Updated 3 weeks ago
- SIFT: Grounding LLM Reasoning in Contexts via Stickers ☆57 Updated 10 months ago
- Optimizing Anytime Reasoning via Budget Relative Policy Optimization ☆51 Updated 5 months ago
- Implementation of Negative-aware Finetuning (NFT) algorithm for "Bridging Supervised Learning and Reinforcement Learning in Math Reasonin… ☆67 Updated 4 months ago
- JudgeLRM: Large Reasoning Models as a Judge ☆40 Updated last month
- Code for "Language Models Can Learn from Verbal Feedback Without Scalar Rewards" ☆55 Updated this week
- Official Implementation of our paper "THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning". ☆28 Updated 3 months ago
- Official code for "BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning" ☆36 Updated 11 months ago
- This is the official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation" ☆40 Updated last year
- ☆19 Updated last year
- ☆72 Updated 6 months ago
- The official code repository for the paper "Mirage or Method? How Model–Task Alignment Induces Divergent RL Conclusions". ☆15 Updated 4 months ago
- ☆17 Updated 5 months ago
- dParallel: Learnable Parallel Decoding for dLLMs ☆53 Updated 2 months ago
- X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains ☆50 Updated 8 months ago
- ☆47 Updated 3 months ago
- ☆23 Updated last year
- Esoteric Language Models ☆108 Updated last month
- [ACL 2025] Are Your LLMs Capable of Stable Reasoning? ☆32 Updated 5 months ago
- Remasking Discrete Diffusion Models with Inference-Time Scaling ☆63 Updated 10 months ago
- FROM $f(x)$ AND $g(x)$ TO $f(g(x))$: LLMs Learn New Skills in RL by Composing Old Ones ☆55 Updated 2 months ago
- ☆128 Updated last month
- Official PyTorch Implementation for Vision-Language Models Create Cross-Modal Task Representations, ICML 2025 ☆31 Updated 8 months ago
- Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization ☆81 Updated 2 weeks ago