hpcaitech / ColossalAI-Examples
Examples of training models with hybrid parallelism using ColossalAI
☆334 · Updated last year
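For orientation, the examples in this repository follow ColossalAI's (now legacy) launch/initialize/engine flow, where the parallel layout is declared in a config rather than in the model code. The sketch below is a minimal, assumption-laden illustration of that flow, not code taken from the repository: the model, data, and parallel sizes are placeholders, and the repo's real hybrid runs combine tensor parallelism with pipeline parallelism and build models from ColossalAI's parallel layers so the sharding actually takes effect.

```python
# Minimal sketch of the legacy ColossalAI training flow (illustrative only).
import colossalai
import torch
from torch.utils.data import DataLoader, TensorDataset

# Parallel layout: 2-way 1D tensor parallelism. The hybrid examples in this
# repo additionally set pipeline parallelism (pipeline=dict(size=...)).
colossalai.launch_from_torch(
    config=dict(parallel=dict(tensor=dict(size=2, mode='1d')))
)

model = torch.nn.Linear(512, 10)  # placeholder; real examples use parallel layers
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = torch.nn.CrossEntropyLoss()
train_loader = DataLoader(
    TensorDataset(torch.randn(64, 512), torch.randint(0, 10, (64,))),
    batch_size=8,
)

# colossalai.initialize wraps everything in an Engine that applies the
# declared parallel strategy; it returns (engine, train loader, test loader,
# lr scheduler).
engine, train_loader, _, _ = colossalai.initialize(
    model, optimizer, criterion, train_loader
)

engine.train()
for data, label in train_loader:
    data, label = data.cuda(), label.cuda()
    engine.zero_grad()
    loss = engine.criterion(engine(data), label)
    engine.backward(loss)
    engine.step()
```

Launched with, e.g., `torchrun --nproc_per_node=2 train.py`; `launch_from_torch` reads the rank and world-size variables that torchrun sets in the environment.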
Related projects:
- Scalable PaLM implementation in PyTorch ☆191 · Updated last year
- Large-scale model inference. ☆630 · Updated last year
- LiBai (李白): A Toolbox for Large-Scale Distributed Parallel Training ☆389 · Updated this week
- Performance benchmarking with ColossalAI ☆38 · Updated 2 years ago
- Fast Inference Solutions for BLOOM ☆556 · Updated last month
- Efficient Training (including pre-training and fine-tuning) for Big Models ☆548 · Updated last month
- Efficient Inference for Big Models ☆573 · Updated last year
- Ongoing research training transformer language models at scale, including: BERT & GPT-2 ☆1,317 · Updated 6 months ago
- Model Compression for Big Models ☆151 · Updated last year
- [NIPS2023] RRHF & Wombat ☆789 · Updated 11 months ago
- Best practice for training LLaMA models in Megatron-LM ☆606 · Updated 8 months ago
- The CUDA version of the RWKV language model (https://github.com/BlinkDL/RWKV-LM) ☆208 · Updated 4 months ago
- Collaborative Training of Large Language Models in an Efficient Way ☆405 · Updated 3 weeks ago
- ☆447 · Updated 3 months ago
- USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference ☆309 · Updated this week
- FlagScale is a large model toolkit based on open-sourced projects. ☆129 · Updated last week
- Train LLaMA on a single A100 80G node using 🤗 Transformers and 🚀 DeepSpeed pipeline parallelism ☆207 · Updated 10 months ago
- Tutel MoE: An Optimized Mixture-of-Experts Implementation ☆711 · Updated last week
- Code used for sourcing and cleaning the BigScience ROOTS corpus ☆299 · Updated last year
- Crosslingual Generalization through Multitask Finetuning ☆510 · Updated last year
- [NeurIPS'23] H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models. ☆360 · Updated last month
- SwissArmyTransformer is a flexible and powerful library to develop your own Transformer variants. ☆955 · Updated 3 weeks ago
- Dive into Big Model Training ☆109 · Updated last year
- Microsoft Automatic Mixed Precision Library ☆507 · Updated this week
- Efficient, Low-Resource, Distributed transformer implementation based on BMTrain ☆233 · Updated 9 months ago
- Rectified Rotary Position Embeddings ☆329 · Updated 4 months ago
- Implementation of a Chinese ChatGPT ☆282 · Updated 10 months ago
- Code for the ALiBi method for transformer language models (ICLR 2022) ☆499 · Updated 10 months ago
- ParaGen is a PyTorch deep learning framework for parallel sequence generation. ☆186 · Updated last year
- Running BERT without Padding ☆456 · Updated 2 years ago