kingoflolz / mesh-transformer-jax
Model parallel transformers in JAX and Haiku
☆6,347 Updated 2 years ago
Alternatives and similar repositories for mesh-transformer-jax
Users who are interested in mesh-transformer-jax are comparing it to the libraries listed below.
- An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries ☆7,269 Updated last week
- An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library. ☆8,297 Updated 3 years ago
- Repo for external large-scale work ☆6,529 Updated last year
- CodeGen is a family of open-source models for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex. ☆5,114 Updated 5 months ago
- ☆1,584 Updated 2 years ago
- Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM ☆7,855 Updated 3 months ago
- 🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading ☆9,731 Updated 10 months ago
- ☆2,849 Updated last month
- Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch ☆5,619 Updated last year
- High-speed download of LLaMA, Facebook's 65B parameter GPT model ☆4,162 Updated 2 years ago
- A collection of libraries to optimise AI model performances ☆8,375 Updated last year
- A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF) ☆4,683 Updated last year
- The hub for EleutherAI's work on interpretability and learning dynamics ☆2,574 Updated last month
- Model API for GALACTICA ☆2,733 Updated 2 years ago
- Large-scale pretrained models for goal-directed dialog ☆874 Updated last year
- ☆1,042 Updated 3 years ago
- Guide to using pre-trained large language models of source code ☆1,832 Updated last year
- Training and serving large-scale neural networks with auto parallelization. ☆3,141 Updated last year
- A robust Python tool for text-based AI training and generation using GPT-2. ☆1,844 Updated 2 years ago
- min(DALL·E) is a fast, minimal port of DALL·E Mini to PyTorch ☆3,490 Updated 3 months ago
- ☆2,154 Updated last year
- Code for GPT-4chan ☆635 Updated 3 years ago
- ☆1,709 Updated 2 years ago
- GLIDE: a diffusion-based text-conditional image synthesis model ☆3,650 Updated last year
- Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in Pytorch ☆11,299 Updated last year
- Decentralized deep learning in PyTorch. Built to train models on thousands of volunteers across the world. ☆2,227 Updated last week
- Running large language models on a single GPU for throughput-oriented scenarios. ☆9,349 Updated 9 months ago
- API for the GPT-J language model 🦜. Including a FastAPI backend and a streamlit frontend ☆336 Updated 3 years ago
- Large-scale pretraining for dialogue ☆2,396 Updated 2 years ago
- Code Generation using GPT-J! ☆515 Updated 3 years ago