QwenLM / QwQ
QwQ is the reasoning model series developed by the Qwen team, Alibaba Cloud.
☆522Updated 6 months ago
Alternatives and similar repositories for QwQ
Users who are interested in QwQ are comparing it to the libraries listed below
- Train your Agent model via our easy and efficient framework☆1,514Updated last week
- ☆312Updated 3 weeks ago
- This repository introduces a comprehensive paper list, datasets, methods, and tools for memory research.☆283Updated 3 months ago
- ☆453Updated last week
- Think Beyond Images☆492Updated this week
- Adds sequence parallelism to LLaMA-Factory☆561Updated last week
- Parallel Scaling Law for Language Model — Beyond Parameter and Inference Time Scaling☆443Updated 4 months ago
- Moxin is a family of fully open-source and reproducible LLMs☆614Updated 2 months ago
- UI-Venus is a native UI agent designed to perform precise GUI element grounding and effective navigation using only screenshots as input.☆484Updated last month
- ☆867Updated this week
- [COLM'25] DeepRetrieval - 🔥 Training Search Agent with Retrieval Outcomes via Reinforcement Learning☆635Updated 3 months ago
- ☆815Updated 3 months ago
- verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for paper "Group-in…☆905Updated this week
- ☆82Updated 5 months ago
- A scalable, end-to-end training pipeline for general-purpose agents☆359Updated 2 months ago
- The official implementation of the ICML 2024 paper "MemoryLLM: Towards Self-Updatable Large Language Models" and "M+: Extending MemoryLLM…☆220Updated 2 months ago
- Ling is a MoE LLM provided and open-sourced by InclusionAI.☆203Updated 4 months ago
- Seed-Coder is a family of lightweight open-source code LLMs comprising base, instruct and reasoning models, developed by ByteDance Seed.☆549Updated 3 months ago
- ☆292Updated 3 months ago
- The official implementation of Self-Play Preference Optimization (SPPO)☆582Updated 8 months ago
- Minimal-cost training of a 0.5B R1-Zero model☆769Updated 4 months ago
- A MemAgent framework that can be extrapolated to 3.5M, along with a training framework for RL training of any agent workflow.☆689Updated last month
- Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL.☆432Updated 2 weeks ago
- MiroThinker is a set of open-source agentic models trained for deep research and complex tool-use scenarios.☆359Updated this week
- Scaling RL on advanced reasoning models☆590Updated last month
- ✨ A synthetic dataset generation framework that produces diverse coding questions and verifiable solutions, all in one framework☆274Updated 3 weeks ago
- [NeurIPS 2024] BAdam: A Memory Efficient Full Parameter Optimization Method for Large Language Models☆269Updated 6 months ago
- Easy data preparation with the latest LLM-based operators and pipelines.☆1,331Updated this week
- ☆422Updated this week
- ☆1,126Updated this week