alibaba / Pai-Megatron-Patch
The official repo of Pai-Megatron-Patch for LLM & VLM large scale training developed by Alibaba Cloud.
☆856Updated last week
Alternatives and similar repositories for Pai-Megatron-Patch:
Users who are interested in Pai-Megatron-Patch are comparing it to the libraries listed below
- Best practice for training LLaMA models in Megatron-LM☆644Updated last year
- RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.☆629Updated last month
- FlagScale is a large model toolkit based on open-sourced projects.☆223Updated this week
- Efficient Training (including pre-training and fine-tuning) for Big Models☆577Updated 7 months ago
- A flexible and efficient training framework for large-scale alignment tasks☆304Updated last week
- Ongoing research training transformer language models at scale, including: BERT & GPT-2☆1,990Updated 2 weeks ago
- Fast inference from large language models via speculative decoding☆661Updated 6 months ago
- USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference☆428Updated this week
- LongBench v2 and LongBench (ACL 2024)☆782Updated last month
- A streamlined and customizable framework for efficient large model evaluation and performance benchmarking☆442Updated this week
- ⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)☆922Updated 2 months ago
- ☆314Updated last month
- Collaborative Training of Large Language Models in an Efficient Way☆411Updated 5 months ago
- InternEvo is an open-sourced lightweight training framework that aims to support model pre-training without the need for extensive dependencie…☆348Updated this week
- A multi-dimensional Chinese alignment evaluation benchmark for large language models (ACL 2024)☆359Updated 6 months ago
- ☆318Updated 7 months ago
- CMMLU: Measuring massive multitask language understanding in Chinese☆726Updated 2 months ago
- A curated collection of open-source SFT datasets, updated continuously☆485Updated last year
- Train a 1B LLM with 1T tokens from scratch as an individual☆537Updated this week
- Ongoing research training transformer language models at scale, including: BERT & GPT-2☆1,364Updated 11 months ago
- Disaggregated serving system for Large Language Models (LLMs).☆468Updated 6 months ago
- FlashInfer: Kernel Library for LLM Serving☆2,111Updated this week
- Official Implementation of EAGLE-1 (ICML'24) and EAGLE-2 (EMNLP'24)☆968Updated this week
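- Welcome to the "LLM-travel" repository! Explore the mysteries of large language models (LLMs) 🚀. Dedicated to understanding, discussing, and implementing the techniques, principles, and applications related to large models.☆298Updated 7 months ago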
- ☆306Updated 7 months ago
- A PyTorch Native LLM Training Framework☆732Updated last month
- ☆153Updated this week
- Train LLaMA on a single A100 80G node using 🤗 transformers and 🚀 DeepSpeed Pipeline Parallelism☆215Updated last year
- 📰 Must-read papers and blogs on Speculative Decoding ⚡️☆597Updated this week
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including …☆231Updated last week