Ledzy / BAdam
[NeurIPS 2024] BAdam: A Memory Efficient Full Parameter Optimization Method for Large Language Models
☆249Updated last week
Alternatives and similar repositories for BAdam:
Users who are interested in BAdam are comparing it to the repositories listed below.
- A recipe for online RLHF and online iterative DPO.☆502Updated 3 months ago
- The official implementation of Self-Play Preference Optimization (SPPO)☆508Updated 2 months ago
- Minimal-cost training of a 0.5B R1-Zero model☆668Updated 2 weeks ago
- Codebase for Iterative DPO Using Rule-based Rewards☆227Updated last month
- adds Sequence Parallelism into LLaMA-Factory☆432Updated this week
- Recipes to train reward model for RLHF.☆1,257Updated last month
- LoRAMoE: Revolutionizing Mixture of Experts for Maintaining World Knowledge in Language Model Alignment☆311Updated 10 months ago
- MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models☆422Updated last year
- ☆139Updated 2 weeks ago
- Personal Project: MPP-Qwen14B & MPP-Qwen-Next(Multimodal Pipeline Parallel based on Qwen-LM). Support [video/image/multi-image] {sft/conv…☆428Updated 2 weeks ago
- ☆182Updated 5 months ago
- The official repo for the paper "LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods".☆312Updated 3 months ago
- DeepRetrieval - Hacking 🔥Real Search Engines and Text/Data Retrievers with LLM + RL☆196Updated this week
- Recipes to train the self-rewarding reasoning LLMs.☆207Updated 3 weeks ago
- This is the repo for the paper "OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use".☆229Updated last month
- PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models (NeurIPS 2024 Spotlight)☆336Updated last month
- ☆166Updated last month
- A visualization tool for deeper understanding and easier debugging of RLHF training.☆177Updated last month
- ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024)☆597Updated 2 months ago
- A series of technical reports on Slow Thinking with LLM☆595Updated this week
- [ICLR 2025🔥] SVD-LLM & [NAACL 2025🔥] SVD-LLM V2☆181Updated last week
- The official implementation of the ICML 2024 paper "MemoryLLM: Towards Self-Updatable Large Language Models" and "M+: Extending MemoryLLM…☆131Updated last month
- Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition☆282Updated 2 months ago
- ☆504Updated 2 months ago
- Controllable Text Generation for Large Language Models: A Survey☆164Updated 7 months ago
- Codes and Data for Scaling Relationship on Learning Mathematical Reasoning with Large Language Models☆251Updated 6 months ago
- The nanoGPT-style implementation of RWKV Language Model - an RNN with GPT-level LLM performance.☆185Updated last year
- Explore the Multimodal “Aha Moment” on 2B Model☆524Updated last week
- A lightweight reproduction of DeepSeek-R1-Zero with in-depth analysis of self-reflection behavior.☆212Updated this week
- Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs"☆357Updated 2 months ago