kingoflolz / mesh-transformer-jaxLinks

Model parallel transformers in JAX and Haiku

☆6,354

Alternatives and similar repositories for mesh-transformer-jax

Users that are interested in mesh-transformer-jax are comparing it to the libraries listed below

Sorting:

EleutherAI / gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
☆7,337Updated last month
EleutherAI / gpt-neo
An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library.
☆8,288Updated 3 years ago
facebookresearch / metaseq
Repo for external large-scale work
☆6,546Updated last year
salesforce / CodeGen
CodeGen is a family of open-source model for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex.
☆5,148Updated 3 weeks ago
EleutherAI / the-pile
☆1,611Updated 2 years ago
minimaxir / aitextgen
A robust Python tool for text-based AI training and generation using GPT-2.
☆1,839Updated 2 years ago
google-research / t5x
☆2,907Updated last week
minimaxir / gpt-2-simple
Python package to easily retrain OpenAI's GPT-2 text-generating model on new texts
☆3,404Updated 2 years ago
openlm-research / open_llama
OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset
☆7,524Updated 2 years ago
bigscience-workshop / petals
🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
☆9,832Updated last year
FMInference / FlexLLMGen
Running large language models on a single GPU for throughput-oriented scenarios.
☆9,377Updated last year
CarperAI / trlx
A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
☆4,728Updated last year
togethercomputer / RedPajama-Data
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
☆4,849Updated 11 months ago
Lightning-AI / lit-llama
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Ad…
☆6,087Updated 4 months ago
nebuly-ai / optimate
A collection of libraries to optimise AI model performances
☆8,366Updated last year
lucidrains / DALLE-pytorch
Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch
☆5,630Updated last year
lucidrains / imagen-pytorch
Implementation of Imagen, Google's Text-to-Image Neural Network, in Pytorch
☆8,390Updated last year
microsoft / DialoGPT
Large-scale pretraining for dialogue
☆2,411Updated 3 years ago
shawwn / llama-dl
High-speed download of LLaMA, Facebook's 65B parameter GPT model
☆4,154Updated 2 years ago
google / BIG-bench
Beyond the Imitation Game collaborative benchmark for measuring and extrapolating the capabilities of language models
☆3,144Updated last year
alpa-projects / alpa
Training and serving large-scale neural networks with auto parallelization.
☆3,164Updated last year
microsoft / GODEL
Large-scale pretrained models for goal-directed dialog
☆885Updated last year
kuprel / min-dalle
min(DALL·E) is a fast, minimal port of DALL·E Mini to PyTorch
☆3,491Updated 6 months ago
stochasticai / xTuring
Build, personalize and control your own LLMs. From data pre-processing to fine-tuning, xTuring provides an easy way to personalize open-s…
☆2,662Updated last week
lucidrains / PaLM-rlhf-pytorch
Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
☆7,872Updated last month
elyase / awesome-gpt3
☆4,555Updated 2 years ago
yk / gpt-4chan-public
Code for GPT-4chan
☆634Updated 3 years ago
openai / gpt-2-output-dataset
Dataset of GPT-2 outputs for research in detection, biases, and more
☆2,001Updated last year
skypilot-org / skypilot
Run, manage, and scale AI workloads on any AI infrastructure. Use one system to access & manage all AI compute (Kubernetes, 17+ clouds, o…
☆8,948Updated this week
google-research / text-to-text-transfer-transformer
Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
☆6,450Updated last week