Tencent / WeDLM
WeDLM: The fastest diffusion language model with standard causal attention and native KV cache compatibility, delivering real speedups over vLLM-optimized baselines.
☆223Updated this week
Alternatives and similar repositories for WeDLM
Users who are interested in WeDLM are comparing it to the repositories listed below
- The official repo for "Parallel-R1: Towards Parallel Thinking via Reinforcement Learning"☆246Updated last month
- Official repository for DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research☆483Updated this week
- Code for R-Zero: Self-Evolving Reasoning LLM from Zero Data (https://www.arxiv.org/pdf/2508.05004)☆714Updated last week
- Training teachers with reinforcement learning so that they can teach LLMs how to reason for test-time scaling.☆355Updated 6 months ago
- ☆1,245Updated last month
- Research code artifacts for Code World Model (CWM) including inference tools, reproducibility, and documentation.☆778Updated this week
- ToolOrchestra is an end-to-end RL training framework for orchestrating tools and agentic workflows.☆430Updated last week
- Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B☆553Updated last month
- Official Repository for "Glyph: Scaling Context Windows via Visual-Text Compression"☆542Updated last month
- OpenTinker is an RL-as-a-Service infrastructure for foundation models☆424Updated this week
- Scaling RL on advanced reasoning models☆650Updated 2 months ago
- Official implementation of "Continuous Autoregressive Language Models"☆676Updated last month
- Ring-V2 is a reasoning MoE LLM provided and open-sourced by InclusionAI.☆85Updated 2 months ago
- Agent0 Series: Self-Evolving Agents from Zero Data☆920Updated this week
- LIMI: Less is More for Agency☆155Updated 2 months ago
- LongCodeZip: Compress Long Context for Code Language Models [ASE2025]☆133Updated last month
- SSRL: Self-Search Reinforcement Learning☆199Updated 4 months ago
- Latent Collaboration in Multi-Agent Systems☆641Updated this week
- Code and implementations for the paper "AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcemen…☆539Updated 3 months ago
- ☆452Updated last week
- A Scientific Multimodal Foundation Model☆624Updated 3 months ago
- ☆363Updated last month
- QeRL enables RL for 32B LLMs on a single H100 GPU.☆470Updated last month
- Next paradigm for LLM Agent. Unify plan and action through recursive code generation for adaptive, human-like decision-making.☆515Updated last month
- Code for paper "The Markovian Thinker: Architecture-Agnostic Linear Scaling of Reasoning"☆327Updated last month
- ☆306Updated 3 months ago
- ☆853Updated 3 months ago
- Implementation of the MetaController proposed in "Emergent temporal abstractions in autoregressive models enable hierarchical reinforceme…☆65Updated this week
- ☆185Updated last month
- The official code implementation for "Cache-to-Cache: Direct Semantic Communication Between Large Language Models"☆304Updated this week