mst272 / simple-lora-plus
A simple implementation of LoRA+: Efficient Low Rank Adaptation of Large Models
☆9 · Updated last year
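LoRA+ (Hayou et al., 2024) keeps the usual LoRA update W + BA but trains the B matrix with a learning rate that is a constant multiple of the A matrix's. The sketch below is a minimal illustration of that idea in PyTorch, not this repository's API; the names `LoRALinear`, `loraplus_param_groups`, and `lr_ratio` are illustrative assumptions.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank update (alpha/r) * B @ A."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():          # freeze the pretrained weights
            p.requires_grad_(False)
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))  # B starts at zero
        self.scaling = alpha / r

    def forward(self, x):
        # Equivalent to applying (W + scaling * B @ A) without materializing the product.
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling

def loraplus_param_groups(model: nn.Module, lr: float = 1e-4, lr_ratio: float = 16.0):
    """LoRA+ idea: put A and B in separate optimizer groups, with lr for A and lr * lr_ratio for B."""
    a_params = [p for n, p in model.named_parameters() if "lora_A" in n]
    b_params = [p for n, p in model.named_parameters() if "lora_B" in n]
    return [{"params": a_params, "lr": lr},
            {"params": b_params, "lr": lr * lr_ratio}]

# Usage: wrap a layer, then build the optimizer from the two parameter groups.
layer = LoRALinear(nn.Linear(768, 768))
optimizer = torch.optim.AdamW(loraplus_param_groups(layer))
```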
Alternatives and similar repositories for simple-lora-plus
Users that are interested in simple-lora-plus are comparing it to the libraries listed below
- A primer and resource roundup for reproducing DeepSeek-R1 ☆21 · Updated 3 months ago
- Pretrain, decay, and SFT a CodeLLM from scratch 🧙‍♂️ ☆36 · Updated last year
- ☆15 · Updated last year
- The official repository for the Scientific Paper Idea Proposer (SciPIP) ☆62 · Updated 4 months ago
- A repo showcasing the use of MCTS with LLMs to solve GSM8K problems ☆84 · Updated 3 months ago
- This is the code repo for the paper "Learning to Route Queries Across Knowledge Bases for Step-wise Retrieval-Augmented Reasoning". ☆16 · Updated last month
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning. COLM 2024 Accepted Paper ☆32 · Updated last year
- Adapt an LLM to a Mixture-of-Experts model using parameter-efficient fine-tuning (LoRA), injecting the LoRAs into the FFN. ☆42 · Updated 8 months ago
- This project provides a tutorial for Tensor Parallel (TP) deployment of Hugging Face LLM models on the 910B, and also serves as a minimal TP learning codebase. ☆25 · Updated 9 months ago
- LLM & RL ☆151 · Updated this week
- ZO2 (Zeroth-Order Offloading): Full Parameter Fine-Tuning 175B LLMs with 18GB GPU Memory ☆128 · Updated last week
- ☆116 · Updated last month
- [ACL 2024] The official codebase for the paper "Self-Distillation Bridges Distribution Gap in Language Model Fine-tuning". ☆122 · Updated 7 months ago
- The code for "AttentionPredictor: Temporal Pattern Matters for Efficient LLM Inference", Qingyue Yang, Jie Wang, Xing Li, Zhihai Wang, Ch… ☆18 · Updated last month
- Due to the huge vocabulary size (151,936) of Qwen models, the Embedding and LM Head weights are excessively heavy. Therefore, this projec… ☆22 · Updated 10 months ago
- Parameter-Efficient Fine-Tuning for Foundation Models ☆69 · Updated 2 months ago
- ☆201 · Updated 8 months ago
- [SIGIR'24] The official implementation code of MOELoRA. ☆168 · Updated 11 months ago
- Inference Code for Paper "Harder Tasks Need More Experts: Dynamic Routing in MoE Models" ☆53 · Updated 10 months ago
- Scripts of LLM pre-training and fine-tuning (w/wo LoRA, DeepSpeed) ☆80 · Updated last year
- MetaSearch: an implementation of LLM deep-research (deepsearch) functionality ☆30 · Updated 2 months ago
- RLHF experiments on a single A100 40G GPU. Supports PPO, GRPO, REINFORCE, RAFT, RLOO, ReMax, and DeepSeek-R1-Zero reproduction. ☆64 · Updated 4 months ago
- Train your GRPO with zero dataset and low resources; 8-bit/4-bit/LoRA/QLoRA supported, multi-GPU supported ... ☆73 · Updated last month
- An Awesome List of Reinforcement Learning-based Large Language Agent Works, collected directly from official code bases. ☆154 · Updated this week
- [NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models ☆46 · Updated last month
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning ☆73 · Updated 4 months ago
- ☆16 · Updated this week
- ☆15 · Updated 8 months ago
- The code and data of DPA-RAG, accepted by WWW 2025 main conference. ☆61 · Updated 5 months ago
- Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning". ☆75 · Updated 3 weeks ago