hpcaitech / PaLM-colossalai

Scalable PaLM implementation in PyTorch

☆188

Alternatives and similar repositories for PaLM-colossalai

Users who are interested in PaLM-colossalai compare it to the libraries listed below.

AI21Labs / Parallel-Context-Windows
☆105 Updated 2 years ago
Dahoas / reward-modeling
☆98 Updated 2 years ago
LAION-AI / Open-Instruction-Generalist
Open Instruction Generalist is an assistant trained on massive synthetic instructions to perform many millions of tasks
☆209 Updated last year
hpcaitech / ColossalAI-Examples
Examples of training models with hybrid parallelism using ColossalAI
☆339 Updated 2 years ago
princeton-nlp / CoFiPruning
[ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408
☆197 Updated 2 years ago
HuangLK / transpeeder
train llama on a single A100 80G node using 🤗 transformers and 🚀 Deepspeed Pipeline Parallelism
☆224 Updated last year
Lightning-Universe / lightning-ColossalAI
Large Scale Distributed Model Training strategy with Colossal AI and Lightning AI
☆56 Updated 2 years ago
qhliu26 / Dive-into-Big-Model-Training
📑 Dive into Big Model Training
☆114 Updated 2 years ago
bigscience-workshop / data-preparation
Code used for sourcing and cleaning the BigScience ROOTS corpus
☆316 Updated 2 years ago
huggingface / transformers-bloom-inference
Fast Inference Solutions for BLOOM
☆565 Updated last year
THUDM / icetk
A unified tokenization tool for Images, Chinese and English.
☆151 Updated 2 years ago
CoinCheung / gdGPT
Train llm (bloom, llama, baichuan2-7b, chatglm3-6b) with deepspeed pipeline mode. Faster than zero/zero++/fsdp.
☆98 Updated last year
zsc / llama_infer
Inference script for Meta's LLaMA models using Hugging Face wrapper
☆109 Updated 2 years ago
anyscale / llm-continuous-batching-benchmarks
☆121 Updated last year
hpcaitech / Titans
A collection of models built with ColossalAI
☆32 Updated 2 years ago
Langboat / mengzi-retrieval-lm
An experimental implementation of the retrieval-enhanced language model
☆75 Updated 2 years ago
ProjectD-AI / LLaMA-Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
☆68 Updated 2 years ago
jaymody / speculative-sampling
Simple implementation of Speculative Sampling in NumPy for GPT-2.
☆98 Updated 2 years ago
hpcaitech / ColossalAI-Pytorch-lightning
☆24 Updated 2 years ago
shjwudp / c4-dataset-script
Inspired by google c4, here is a series of colossal clean data cleaning scripts focused on CommonCrawl data processing. Including Chinese…
☆131 Updated 2 years ago
SimiaoZuo / MoEBERT
This PyTorch package implements MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation (NAACL 2022).
☆112 Updated 3 years ago
RulinShao / LightSeq
Official repository for DistFlashAttn: Distributed Memory-efficient Attention for Long-context LLMs Training
☆216 Updated last year
thomfoster / minRLHF
A (somewhat) minimal library for finetuning language models with PPO on human feedback.
☆87 Updated 2 years ago
dropreg / efficient_alpaca
The aim of this repository is to utilize LLaMA to reproduce and enhance the Stanford Alpaca
☆98 Updated 2 years ago
princeton-nlp / DinkyTrain
Princeton NLP's pre-training library based on fairseq with DeepSpeed kernel integration 🚃
☆114 Updated 3 years ago
p-lambda / dsir
DSIR large-scale data selection framework for language model training
☆265 Updated last year
imoneoi / multipack
Multipack distributed sampler for fast padding-free training of LLMs
☆201 Updated last year
SeanNaren / min-LLM
Minimal code to train a Large Language Model (LLM).
☆172 Updated 3 years ago
SeanNaren / minGPT
A minimal PyTorch Lightning OpenAI GPT w DeepSpeed Training!
☆113 Updated 2 years ago
teelinsan / parallel-decoding
Repository of the paper "Accelerating Transformer Inference for Translation via Parallel Decoding"
☆120 Updated last year