Tongyi-Zhiwen / QwenLong-L1
☆277 · Updated last month
Alternatives and similar repositories for QwenLong-L1
Users interested in QwenLong-L1 are comparing it to the libraries listed below.
- ☆77 · Updated 3 months ago
- ☆89 · Updated last month
- [ICML 2025] TokenSwift: Lossless Acceleration of Ultra Long Sequence Generation ☆110 · Updated last month
- ☆154 · Updated 2 months ago
- Parallel Scaling Law for Language Model — Beyond Parameter and Inference Time Scaling ☆410 · Updated last month
- Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models ☆135 · Updated last year
- [ICML 2025] Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale ☆251 · Updated this week
- A MemAgent framework that can be extrapolated to 3.5M, along with a training framework for RL training of any agent workflow. ☆157 · Updated this week
- [COLM 2025] An Open Math Pre-training Dataset with 370B Tokens. ☆95 · Updated 3 months ago
- ☆318 · Updated 9 months ago
- Deep Reasoning Translation (DRT) Project ☆225 · Updated last month
- A highly capable 2.4B lightweight LLM using only 1T pre-training data with all details. ☆195 · Updated last week
- Ling is a MoE LLM provided and open-sourced by InclusionAI. ☆175 · Updated last month
- Mixture-of-Experts (MoE) Language Model ☆189 · Updated 10 months ago
- DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents ☆174 · Updated 3 weeks ago
- Efficient Agent Training for Computer Use ☆111 · Updated last month
- The RedStone repository includes code for preparing extensive datasets used in training large language models. ☆135 · Updated last week
- Chain of Experts (CoE) enables communication between experts within Mixture-of-Experts (MoE) models ☆216 · Updated 2 weeks ago
- MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search too… ☆241 · Updated last week
- ☆94 · Updated 7 months ago
- Repo for "MaskSearch: A Universal Pre-Training Framework to Enhance Agentic Search Capability" ☆129 · Updated last month
- A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning ☆227 · Updated last month
- Implementation for OAgents: An Empirical Study of Building Effective Agents ☆76 · Updated last week
- Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-based LLMs ☆178 · Updated 3 weeks ago
- ☆64 · Updated last month
- Delta-CoMe achieves near-lossless 1-bit compression; accepted at NeurIPS 2024. ☆57 · Updated 7 months ago
- Scaling RL on advanced reasoning models ☆392 · Updated this week
- GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents ☆286 · Updated this week
- [ACL 2024] LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement ☆185 · Updated last year
- Repo of ACL 2025 main paper "Quantification of Large Language Model Distillation" ☆88 · Updated last month