liangyuwang / zo2

ZO2 (Zeroth-Order Offloading): Full Parameter Fine-Tuning 175B LLMs with 18GB GPU Memory

☆148

Alternatives and similar repositories for zo2

Users who are interested in zo2 are comparing it to the repositories listed below.

maple-research-lab / SLOT
☆91Updated 3 weeks ago
GAIR-NLP / MAYE
Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme
☆133Updated 3 months ago
lzhxmu / CPPO
CPPO: Accelerating the Training of Group Relative Policy Optimization-Based Reasoning Models
☆142Updated last month
RUC-GSAI / YuLan-Mini
A highly capable 2.4B lightweight LLM using only 1T pre-training data with all details.
☆195Updated last week
bigai-nlco / TokenSwift
[ICML 2025] TokenSwift: Lossless Acceleration of Ultra Long Sequence Generation
☆110Updated last month
THUDM / Awesome-Parameter-Efficient-Fine-Tuning-for-Foundation-Models
Parameter-Efficient Fine-Tuning for Foundation Models
☆73Updated 3 months ago
RUCAIBox / Virgo
Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM*
☆105Updated last month
SuperGPQA / SuperGPQA
☆154Updated 2 months ago
step-law / steplaw
☆193Updated 2 months ago
Tencent / llm.hunyuan.T1
☆77Updated 3 months ago
Chen-GX / C-3PO
[ICML2025] The official implementation of "C-3PO: Compact Plug-and-Play Proxy Optimization to Achieve Human-like Retrieval-Augmented Gene…
☆31Updated 2 months ago
MiniMax-AI / One-RL-to-See-Them-All
The official repo of One RL to See Them All: Visual Triple Unified Reinforcement Learning
☆287Updated last month
dongguanting / Tool-Star
Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning
☆197Updated last week
LengSicong / MMR1
MMR1: Advancing the Frontiers of Multimodal Reasoning
☆162Updated 3 months ago
pprp / Awesome-Efficient-MoE
Efficient Mixture of Experts for LLM Paper List
☆79Updated 6 months ago
yfzhang114 / r1_reward
✨✨R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning
☆153Updated 2 months ago
yafuly / TPO
Test-time preference optimization (ICML 2025).
☆146Updated 2 months ago
EvolvingLMMs-Lab / multimodal-search-r1
MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search too…
☆241Updated last week
horseee / CoT-Valve
CoT-Valve: Length-Compressible Chain-of-Thought Tuning
☆76Updated 5 months ago
MingyuJ666 / Disentangling-Memory-and-Reasoning
[ACL'25] We propose a novel fine-tuning method, Separate Memory and Reasoning, which combines prompt tuning with LoRA.
☆67Updated last month
InternLM / OREAL
Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning
☆188Updated 3 months ago
testtimescaling / testtimescaling.github.io
"what, how, where, and how well? a survey on test-time scaling in large language models" repository
☆51Updated this week
Chongjie-Si / Subspace-Tuning
A generalized framework for subspace tuning methods in parameter efficient fine-tuning.
☆147Updated 2 weeks ago
InternLM / Condor
[ACL 2025] An official PyTorch implementation of the paper: Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement
☆30Updated last month
deepglint / UniME
[ACM MM25] The official code of "Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs"
☆80Updated last week
THU-KEG / AdaptThink
☆132Updated last month
OpenBMB / RLPR
Extrapolating RLVR to General Domains without Verifiers
☆112Updated 2 weeks ago
Wangmerlyn / MCTS-GSM8k-Demo
This is a repo for showcasing using MCTS with LLMs to solve gsm8k problems
☆85Updated 3 months ago
yyht / openrlhf_async_pipline
☆59Updated 3 weeks ago
IAAR-Shanghai / xVerify
xVerify: Efficient Answer Verifier for Reasoning Model Evaluations
☆119Updated 2 months ago