alibaba / Pai-Megatron-Patch
The official repo of Pai-Megatron-Patch for LLM & VLM large scale training developed by Alibaba Cloud.
☆946 · Updated this week
Alternatives and similar repositories for Pai-Megatron-Patch:
Users interested in Pai-Megatron-Patch are comparing it to the libraries listed below.
- Best practice for training LLaMA models in Megatron-LM ☆645 · Updated last year
- FlagScale is a large model toolkit based on open-sourced projects. ☆253 · Updated this week
- Community maintained hardware plugin for vLLM on Ascend ☆370 · Updated this week
- RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications. ☆670 · Updated 2 months ago
- Ongoing research training transformer language models at scale, including: BERT & GPT-2 ☆2,022 · Updated last month
- A streamlined and customizable framework for efficient large model evaluation and performance benchmarking ☆647 · Updated this week
- A flexible and efficient training framework for large-scale alignment tasks ☆333 · Updated last month
- ⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024) ☆936 · Updated 3 months ago
- Efficient Training (including pre-training and fine-tuning) for Big Models ☆580 · Updated 8 months ago
- USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference ☆452 · Updated this week
- LongBench v2 and LongBench (ACL 2024) ☆815 · Updated 2 months ago
- Fast inference from large language models via speculative decoding ☆692 · Updated 7 months ago
- InternEvo is an open-sourced lightweight training framework that aims to support model pre-training without the need for extensive dependencies… ☆368 · Updated this week
- Ring attention implementation with flash attention ☆714 · Updated last month
- A PyTorch Native LLM Training Framework ☆759 · Updated 2 months ago
- Disaggregated serving system for Large Language Models (LLMs). ☆507 · Updated 7 months ago
- Train a 1B LLM with 1T tokens from scratch as a personal project ☆583 · Updated 2 weeks ago
- A curated collection of open-source SFT datasets, updated on an ongoing basis ☆500 · Updated last year
- EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL ☆1,681 · Updated this week
- ☆318 · Updated 8 months ago
- Collaborative Training of Large Language Models in an Efficient Way ☆413 · Updated 6 months ago
- Reproduce R1 Zero on Logic Puzzle ☆2,208 · Updated this week
- Welcome to the "LLM-travel" repository! Explore the mysteries of large language models (LLMs) 🚀, dedicated to in-depth understanding, discussion, and implementation of techniques, principles, and applications related to large models. ☆303 · Updated 8 months ago
- LLM Inference benchmark ☆404 · Updated 8 months ago
- Accelerate inference without tears ☆306 · Updated last week
- A high-performance deep learning training platform with task-level time-sharing scheduling of GPU compute ☆601 · Updated last year
- FlagEval is an evaluation toolkit for AI large foundation models. ☆327 · Updated 8 months ago
- My learning notes/codes for ML SYS. ☆1,481 · Updated last week
- ☆324 · Updated 2 months ago
- ☆157 · Updated this week