TencentARC/LLaMA-Pro

[ACL 2024] Progressive LLaMA with Block Expansion.

☆513

Alternatives and similar repositories for LLaMA-Pro

Users who are interested in LLaMA-Pro are comparing it to the libraries listed below.

hills-code / open-instruct
☆16 · May 8, 2024 · Updated 2 years ago
arcee-ai / mergekit
Tools for merging pretrained large language models.
☆7,246 · Jun 17, 2026 · Updated last month
hkust-nlp / deita
Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]
☆599 · Dec 9, 2024 · Updated last year
pjlab-sys4nlp / llama-moe
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)
☆1,004 · Dec 6, 2024 · Updated last year
fanqiwan / FuseAI
FuseAI Project
☆600 · Jan 25, 2025 · Updated last year
NVIDIA / NeMo-Aligner
Scalable toolkit for efficient model alignment
☆850 · Oct 6, 2025 · Updated 9 months ago
JIA-Lab-research / LongLoRA
Code and documents of LongLoRA and LongAlpaca (ICLR 2024 Oral)
☆2,689 · Aug 14, 2024 · Updated last year
microsoft / rho
Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs.
☆470 · Apr 18, 2024 · Updated 2 years ago
XueFuzhao / OpenMoE
A family of open-sourced Mixture-of-Experts (MoE) Large Language Models
☆1,691 · Mar 8, 2024 · Updated 2 years ago
myshell-ai / JetMoE
Reaching LLaMA2 Performance with 0.1M Dollars
☆986 · Jul 23, 2024 · Updated last year
multimodal-art-projection / MAP-NEO
☆985 · Feb 7, 2025 · Updated last year
FranxYao / Long-Context-Data-Engineering
Implementation of paper Data Engineering for Scaling Language Models to 128K Context
☆501 · Mar 19, 2024 · Updated 2 years ago
yegcjs / mixinglaws
☆113 · Jul 15, 2025 · Updated last year
huggingface / alignment-handbook
Robust recipes to align language models with human and AI preferences
☆5,639 · May 26, 2026 · Updated last month
yule-BUAA / MergeLM
Codebase for Merging Language Models (ICML 2024)
☆869 · May 5, 2024 · Updated 2 years ago
jzhang38 / TinyLlama
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
☆9,014 · May 3, 2024 · Updated 2 years ago
uclaml / SPIN
The official implementation of Self-Play Fine-Tuning (SPIN)
☆1,247 · May 8, 2024 · Updated 2 years ago
deepseek-ai / DeepSeek-MoE
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
☆1,951 · Jan 16, 2024 · Updated 2 years ago
allenai / open-instruct
AllenAI's post-training codebase
☆3,801 · Updated this week
datamllab / LongLM
[ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
☆668 · Jun 1, 2024 · Updated 2 years ago
OpenLMLab / LOMO
LOMO: LOw-Memory Optimization
☆994 · Jul 2, 2024 · Updated 2 years ago
baaivision / Emu
Emu Series: Generative Multimodal Models from BAAI
☆1,776 · Jan 12, 2026 · Updated 6 months ago
jzhang38 / EasyContext
Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.
☆759 · Sep 27, 2024 · Updated last year
XuezheMax / megalodon
Reference implementation of Megalodon 7B model
☆526 · May 17, 2025 · Updated last year
ContextualAI / HALOs
A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).
☆908 · Sep 30, 2025 · Updated 9 months ago
VITA-Group / Q-GaLore
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients.
☆206 · Jul 17, 2024 · Updated 2 years ago
sail-sg / lorahub
[COLM 2024] LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition
☆671 · Jul 22, 2024 · Updated last year
IEIT-Yuan / Yuan2.0-M32
Mixture-of-Experts (MoE) Language Model
☆194 · Sep 9, 2024 · Updated last year
eddycmu / demystify-long-cot
☆336 · May 31, 2025 · Updated last year
allenai / OLMo
Modeling, training, eval, and inference code for OLMo
☆6,591 · Nov 24, 2025 · Updated 7 months ago
bigcode-project / astraios
Astraios: Parameter-Efficient Instruction Tuning Code Language Models
☆63 · Apr 10, 2024 · Updated 2 years ago
PhoebusSi / Alpaca-CoT
We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs and parameter-efficient methods (e.g., lora, p-tunin…
☆2,791 · Dec 12, 2023 · Updated 2 years ago
open-compass / MixtralKit
A toolkit for inference and evaluation of 'mixtral-8x7b-32kseqlen' from Mistral AI
☆770 · Dec 15, 2023 · Updated 2 years ago
neelsjain / NEFTune
Official repository of NEFTune: Noisy Embeddings Improves Instruction Finetuning
☆412 · May 17, 2024 · Updated 2 years ago
allenai / dolma
Data and tools for generating and inspecting OLMo pre-training data.
☆1,526 · Nov 5, 2025 · Updated 8 months ago
OpenRLHF / OpenRLHF
An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Asy…
☆9,828 · Jul 14, 2026 · Updated last week
princeton-nlp / LLM-Shearing
[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
☆640 · Mar 4, 2024 · Updated 2 years ago
huggingface / nanotron
Minimalistic large language model 3D-parallelism training
☆2,755 · May 26, 2026 · Updated last month
lucidrains / CALM-pytorch
Implementation of CALM from the paper "LLM Augmented LLMs: Expanding Capabilities through Composition", out of Google Deepmind
☆177 · Sep 12, 2024 · Updated last year