hpcaitech / PaLM-colossalai
Scalable PaLM implementation in PyTorch
☆191 · Updated 2 years ago
Alternatives and similar repositories for PaLM-colossalai:
Users interested in PaLM-colossalai are comparing it to the libraries listed below.
- Performance benchmarking with ColossalAI ☆39 · Updated 2 years ago
- Official repository for LightSeq: Sequence Level Parallelism for Distributed Training of Long Context Transformers ☆207 · Updated 7 months ago
- Examples of training models with hybrid parallelism using ColossalAI ☆339 · Updated 2 years ago
- Fast Inference Solutions for BLOOM ☆562 · Updated 5 months ago
- GPTQ inference Triton kernel ☆300 · Updated last year
- Open Instruction Generalist is an assistant trained on massive synthetic instructions to perform many millions of tasks ☆208 · Updated last year
- ☆104 · Updated last year
- ☆96 · Updated last year
- Microsoft Automatic Mixed Precision Library ☆587 · Updated 6 months ago
- ☆116 · Updated last year
- Running BERT without Padding ☆471 · Updated 3 years ago
- Train LLaMA on a single A100 80G node using 🤗 Transformers and 🚀 DeepSpeed pipeline parallelism ☆216 · Updated last year
- 📑 Dive into Big Model Training ☆110 · Updated 2 years ago
- Simple implementation of Speculative Sampling in NumPy for GPT-2. ☆92 · Updated last year
- Official repository for LongChat and LongEval ☆515 · Updated 10 months ago
- A unified tokenization tool for Images, Chinese and English. ☆151 · Updated 2 years ago
- Code for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718 ☆313 · Updated 6 months ago
- Large Scale Distributed Model Training strategy with Colossal AI and Lightning AI ☆57 · Updated last year
- Repository for analysis and experiments in the BigCode project. ☆117 · Updated last year
- DSIR large-scale data selection framework for language model training ☆244 · Updated 11 months ago
- Code used for sourcing and cleaning the BigScience ROOTS corpus ☆309 · Updated 2 years ago
- This is a text generation method which returns a generator, streaming out each token in real-time during inference, based on Huggingface/… ☆95 · Updated last year
- Large-scale model inference. ☆628 · Updated last year
- Techniques used to run BLOOM at inference in parallel ☆37 · Updated 2 years ago
- Train LLMs (BLOOM, LLaMA, Baichuan2-7B, ChatGLM3-6B) with DeepSpeed pipeline mode. Faster than ZeRO/ZeRO++/FSDP. ☆95 · Updated last year
- REST: Retrieval-Based Speculative Decoding, NAACL 2024 ☆198 · Updated 4 months ago
- USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference ☆463 · Updated 2 weeks ago
- [NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization ☆338 · Updated 7 months ago
- Scaling Data-Constrained Language Models ☆335 · Updated 6 months ago
- 🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components. ☆190 · Updated this week