lucidrains / PaLM-pytorch
Implementation of the specific Transformer architecture from PaLM - Scaling Language Modeling with Pathways
☆828 · Nov 9, 2022 · Updated 3 years ago
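For context, the repository wraps the PaLM decoder (parallel attention/feed-forward blocks, SwiGLU, rotary embeddings) behind a single module. Below is a minimal usage sketch assuming the `PaLM` class and constructor arguments shown in the repo's README (`num_tokens`, `dim`, `depth`, `heads`, `dim_head`); the hyperparameter values are illustrative only.

```python
import torch
from palm_pytorch import PaLM  # package layout assumed from the repo's README

# hypothetical hyperparameters for illustration
palm = PaLM(
    num_tokens = 20000,  # vocabulary size
    dim = 512,           # model width
    depth = 12,          # number of transformer blocks
    heads = 8,
    dim_head = 64,
)

tokens = torch.randint(0, 20000, (1, 2048))  # dummy token ids
logits = palm(tokens)                        # (1, 2048, 20000) next-token logits
```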
Alternatives and similar repositories for PaLM-pytorch
Users interested in PaLM-pytorch are comparing it to the libraries listed below.
- Scalable PaLM implementation in PyTorch ☆190 · Dec 19, 2022 · Updated 3 years ago
- Implementation of RETRO, Deepmind's Retrieval based Attention net, in Pytorch ☆879 · Oct 30, 2023 · Updated 2 years ago
- Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM ☆7,882 · Oct 11, 2025 · Updated 4 months ago
- Repo for external large-scale work ☆6,544 · Apr 27, 2024 · Updated last year
- Parallelformers: An Efficient Model Parallelization Toolkit for Deployment ☆791 · Apr 24, 2023 · Updated 2 years ago
- A concise but complete implementation of CLIP with various experimental improvements from recent papers ☆722 · Oct 16, 2023 · Updated 2 years ago
- Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing ☆49 · Jan 27, 2022 · Updated 4 years ago
- A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF) ☆4,742 · Jan 8, 2024 · Updated 2 years ago
- Implementation of TableFormer, Robust Transformer Modeling for Table-Text Encoding, in Pytorch ☆39 · Mar 29, 2022 · Updated 3 years ago
- ☆2,947 · Jan 15, 2026 · Updated last month
- Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch ☆5,629 · Feb 17, 2024 · Updated last year
- Implementation of a Transformer using ReLA (Rectified Linear Attention) from https://arxiv.org/abs/2104.07012 ☆49 · Apr 6, 2022 · Updated 3 years ago
- Transformer related optimization, including BERT, GPT ☆6,392 · Mar 27, 2024 · Updated last year
- Explorations into training LLMs to use clinical calculators from patient history, using open sourced models. Will start with Wells' Crite… ☆316 · Aug 31, 2025 · Updated 5 months ago
- PyTorch implementation of a 1.3B text-to-image generation model trained on 14 million image-text pairs ☆634 · Aug 9, 2022 · Updated 3 years ago
- Open-source pre-training implementation of Google's LaMDA in PyTorch. Adding RLHF similar to ChatGPT. ☆470 · Feb 24, 2024 · Updated last year
- Official code and model checkpoints for our EMNLP 2022 paper "RankGen - Improving Text Generation with Large Ranking Models" (https://arx… ☆137 · Aug 2, 2023 · Updated 2 years ago
- ☆1,559 · Feb 5, 2026 · Updated last week
- OSLO: Open Source for Large-scale Optimization ☆175 · Sep 9, 2023 · Updated 2 years ago
- ☆184 · May 26, 2023 · Updated 2 years ago
- Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities ☆22,021 · Jan 23, 2026 · Updated 3 weeks ago
- Foundation Architecture for (M)LLMs ☆3,130 · Apr 11, 2024 · Updated last year
- Implementation of CoCa, Contrastive Captioners are Image-Text Foundation Models, in Pytorch ☆1,200 · Dec 12, 2023 · Updated 2 years ago
- An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries ☆7,382 · Feb 3, 2026 · Updated 2 weeks ago
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ☆9,491 · Feb 6, 2026 · Updated last week
- Reproduce results and replicate training for T0 (Multitask Prompted Training Enables Zero-Shot Task Generalization) ☆465 · Nov 5, 2022 · Updated 3 years ago
- Korean Named Entity Corpus ☆25 · May 12, 2023 · Updated 2 years ago
- Ongoing research training transformer language models at scale, including: BERT & GPT-2 ☆1,434 · Mar 20, 2024 · Updated last year
- An open-source implementation of Google's PaLM models ☆820 · Jun 21, 2024 · Updated last year
- A modular RL library to fine-tune language models to human preferences ☆2,377 · Mar 1, 2024 · Updated last year
- PyTorch extensions for high performance and large scale training. ☆3,397 · Apr 26, 2025 · Updated 9 months ago
- Implementation of Memorizing Transformers (ICLR 2022), attention net augmented with indexing and retrieval of memories using approximate … ☆641 · Jul 17, 2023 · Updated 2 years ago
- Implementation of Mega, the Single-head Attention with Multi-headed EMA architecture that currently holds SOTA on Long Range Arena ☆207 · Aug 26, 2023 · Updated 2 years ago
- Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀 ☆1,689 · Oct 23, 2024 · Updated last year
- A Unified Library for Parameter-Efficient and Modular Transfer Learning ☆2,802 · Oct 12, 2025 · Updated 4 months ago
- Large Scale Distributed Model Training strategy with Colossal AI and Lightning AI ☆56 · Sep 1, 2023 · Updated 2 years ago
- Used for adaptive human in the loop evaluation of language and embedding models. ☆308 · Mar 1, 2023 · Updated 2 years ago
- An implementation of Performer, a linear attention-based transformer, in Pytorch ☆1,172 · Feb 2, 2022 · Updated 4 years ago
- Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch ☆2,619 · Jan 12, 2025 · Updated last year