Ledzy / BAdamLinks

[NeurIPS 2024] BAdam: A Memory Efficient Full Parameter Optimization Method for Large Language Models

☆275

Alternatives and similar repositories for BAdam

Users that are interested in BAdam are comparing it to the libraries listed below

Sorting:

RLHFlow / Minimal-RL
☆248Updated 6 months ago
RLHFlow / Online-RLHF
A recipe for online RLHF and online iterative DPO.
☆536Updated 10 months ago
cmriat / l0
A scalable, end-to-end training pipeline for general-purpose agents
☆361Updated 4 months ago
RLHFlow / Online-DPO-R1
Codebase for Iterative DPO Using Rule-based Rewards
☆261Updated 7 months ago
meta-math / MetaMath
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models
☆451Updated last year
Qihoo360 / 360-LLaMA-Factory
adds Sequence Parallelism into LLaMA-Factory
☆591Updated last month
InternLM / InternBootcamp
☆323Updated 2 months ago
IAAR-Shanghai / CTGSurvey
Controllable Text Generation for Large Language Models: A Survey
☆193Updated last year
uclaml / SPPO
The official implementation of Self-Play Preference Optimization (SPPO)
☆582Updated 9 months ago
jordddan / Pruning-LLMs
The framework to prune LLMs to any size and any config.
☆94Updated last year
IAAR-Shanghai / UHGEval
[ACL 2024] User-friendly evaluation framework: Eval Suite & Benchmarks: UHGEval, HaluEval, HalluQA, etc.
☆178Updated 5 months ago
RLHFlow / Self-rewarding-reasoning-LLM
Recipes to train the self-rewarding reasoning LLMs.
☆227Updated 8 months ago
step-law / steplaw
☆205Updated 3 weeks ago
AIoT-MLSys-Lab / SVD-LLM
[ICLR 2025🔥] SVD-LLM & [NAACL 2025🔥] SVD-LLM V2
☆261Updated 2 months ago
zhuhanqing / APOLLO
APOLLO: SGD-like Memory, AdamW-level Performance; MLSys'25 Oustanding Paper Honorable Mention
☆258Updated 6 months ago
Outsider565 / LoRA-GA
☆213Updated last year
OS-Agent-Survey / OS-Agent-Survey
This is the repo for the paper "OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use" (ACL 2025 Oral).
☆365Updated 3 months ago
KodCode-AI / kodcode
✨ A synthetic dataset generation framework that produces diverse coding questions and verifiable solutions - all in one framwork
☆289Updated 2 months ago
OPPO-PersonalAI / TaskCraft
A library for generating difficulty-scalable, multi-tool, and verifiable agentic tasks with execution trajectories.
☆167Updated 4 months ago
IAAR-Shanghai / ICSFSurvey
Explore concepts like Self-Correct, Self-Refine, Self-Improve, Self-Contradict, Self-Play, and Self-Knowledge, alongside o1-like reasonin…
☆170Updated 11 months ago
wangyu-ustc / MemoryLLM
The official implementation of the ICML 2024 paper "MemoryLLM: Towards Self-Updatable Large Language Models" and "M+: Extending MemoryLLM…
☆255Updated 3 months ago
Infini-AI-Lab / TriForce
[COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding
☆270Updated last year
dhcode-cpp / X-R1
minimal-cost for training 0.5B R1-Zero
☆785Updated 6 months ago
yongliang-wu / DFT
[Preprint] On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification.
☆498Updated 2 weeks ago
RLHFlow / RLHF-Reward-Modeling
Recipes to train reward model for RLHF.
☆1,480Updated 6 months ago
modelscope / Trinity-RFT
Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (…
☆404Updated last week
HKUDS / SepLLM
[ICML 2025] "SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator"
☆555Updated 3 months ago
yfzhang114 / r1_reward
✨✨R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning
☆270Updated 6 months ago
TemporaryLoRA / Temp-LoRA
☆119Updated last year
Tencent / CognitiveKernel-Pro
Deep Research Agent CognitiveKernel-Pro from Tencent AI Lab. Paper: https://arxiv.org/pdf/2508.00414
☆459Updated last month