lucidrains/ponder-transformer

Implementation of a Transformer that Ponders, using the scheme from the PonderNet paper

☆84

Alternatives and similar repositories for ponder-transformer

Users who are interested in ponder-transformer are comparing it to the libraries listed below.

lucidrains / token-shift-gpt
Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing
☆49 · Jan 27, 2022 · Updated 4 years ago
lucidrains / multistream-transformers
Implementation of Multistream Transformers in Pytorch
☆54 · Jul 31, 2021 · Updated 4 years ago
lucidrains / triton-transformer
Implementation of a Transformer, but completely in Triton
☆279 · Apr 5, 2022 · Updated 4 years ago
lucidrains / rela-transformer
Implementation of a Transformer using ReLA (Rectified Linear Attention) from https://arxiv.org/abs/2104.07012
☆49 · Apr 6, 2022 · Updated 4 years ago
lucidrains / einops-exts
Implementation of some personal helper functions for Einops, my most favorite tensor manipulation library ❤️
☆57 · Jan 5, 2023 · Updated 3 years ago
lucidrains / g-mlp-gpt
GPT, but made only out of MLPs
☆89 · May 25, 2021 · Updated 5 years ago
lucidrains / n-grammer-pytorch
Implementation of N-Grammer, augmenting Transformers with latent n-grams, in Pytorch
☆81 · Dec 4, 2022 · Updated 3 years ago
lucidrains / isab-pytorch
An implementation of (Induced) Set Attention Block, from the Set Transformers paper
☆70 · Jun 8, 2026 · Updated last month
lucidrains / all-normalization-transformer
A simple Transformer where the softmax has been replaced with normalization
☆20 · Sep 11, 2020 · Updated 5 years ago
lucidrains / feedback-transformer-pytorch
Implementation of Feedback Transformer in Pytorch
☆108 · Mar 2, 2021 · Updated 5 years ago
lucidrains / hourglass-transformer-pytorch
Implementation of Hourglass Transformer, in Pytorch, from Google and OpenAI
☆99 · Dec 31, 2021 · Updated 4 years ago
lucidrains / metaformer-gpt
Implementation of Metaformer, but in an autoregressive manner
☆26 · Jun 21, 2022 · Updated 4 years ago
louiskirsch / vsml-neurips2021
Code for "Meta Learning Backpropagation And Improving It" @ NeurIPS 2021 https://arxiv.org/abs/2012.14905
☆33 · Jan 9, 2022 · Updated 4 years ago
lucidrains / panoptic-transformer
Another attempt at a long-context / efficient transformer by me
☆38 · Apr 11, 2022 · Updated 4 years ago
CompVis / visual-search
Visual search interface
☆11 · Nov 30, 2021 · Updated 4 years ago
lucidrains / long-short-transformer
Implementation of Long-Short Transformer, combining local and global inductive biases for attention over long sequences, in Pytorch
☆120 · Aug 4, 2021 · Updated 4 years ago
lucidrains / esbn-transformer
An attempt to merge ESBN with Transformers, to endow Transformers with the ability to emergently bind symbols
☆16 · Aug 3, 2021 · Updated 4 years ago
antofuller / configaformers
A python library for highly configurable transformers - easing model architecture search and experimentation.
☆48 · Nov 30, 2021 · Updated 4 years ago
RobertCsordas / ndr
The official repository for our paper "The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization".
☆34 · Jun 11, 2025 · Updated last year
rovle / gpt3-in-context-fitting
Experiments on GPT-3's ability to fit numerical models in-context.
☆14 · Aug 11, 2022 · Updated 3 years ago
lucidrains / mlp-gpt-jax
A GPT, made only of MLPs, in Jax
☆59 · Jun 23, 2021 · Updated 5 years ago
lucidrains / scaling-vin-pytorch
Exploration into the Scaling Value Iteration Networks paper, from Schmidhuber's group
☆37 · Sep 23, 2024 · Updated last year
lucidrains / flash-cosine-sim-attention
Implementation of fused cosine similarity attention in the same style as Flash Attention
☆220 · Feb 13, 2023 · Updated 3 years ago
lucidrains / fast-transformer-pytorch
Implementation of Fast Transformer in Pytorch
☆176 · Aug 26, 2021 · Updated 4 years ago
lucidrains / jax2torch
Use Jax functions in Pytorch
☆263 · Jul 1, 2023 · Updated 3 years ago
lucidrains / tranception-pytorch
Implementation of Tranception, an attention network, paired with retrieval, that is SOTA for protein fitness prediction
☆32 · Jun 19, 2022 · Updated 4 years ago
lucidrains / omninet-pytorch
Implementation of OmniNet, Omnidirectional Representations from Transformers, in Pytorch
☆59 · Mar 19, 2021 · Updated 5 years ago
j-towns / vdvae-jax
Very deep VAEs in JAX/Flax
☆47 · Jun 16, 2021 · Updated 5 years ago
lucidrains / glom-pytorch
An attempt at the implementation of Glom, Geoffrey Hinton's new idea that integrates concepts from neural fields, top-down-bottom-up proc…
☆196 · Mar 27, 2021 · Updated 5 years ago
lucidrains / self-reasoning-tokens-pytorch
Exploration into the proposed "Self Reasoning Tokens" by Felipe Bonetto
☆57 · May 17, 2024 · Updated 2 years ago
lucidrains / charformer-pytorch
Implementation of the GBST block from the Charformer paper, in Pytorch
☆118 · Jul 15, 2021 · Updated 5 years ago
neale / avoiding-side-effects
Code for reproducing the results from the paper Avoiding Side Effects in Complex Environments
☆12 · Jun 3, 2021 · Updated 5 years ago
lucidrains / perceiver-pytorch
Implementation of Perceiver, General Perception with Iterative Attention, in Pytorch
☆1,217 · Jun 8, 2026 · Updated last month
lucidrains / PaLM-jax
Implementation of the specific Transformer architecture from PaLM - Scaling Language Modeling with Pathways - in Jax (Equinox framework)
☆189 · Jun 24, 2022 · Updated 4 years ago
jeongukjae / namuwiki-corpus
Namuwiki dataset segmented into sentences. Download it from Releases, or through tfds-korean.
☆19 · Jun 16, 2021 · Updated 5 years ago
btma48 / AutoLA
Code of our Neurips2020 paper "Auto Learning Attention", coming soon
☆22 · Apr 14, 2021 · Updated 5 years ago
lucidrains / Adan-pytorch
Implementation of the Adan (ADAptive Nesterov momentum algorithm) Optimizer in Pytorch
☆251 · Sep 1, 2022 · Updated 3 years ago
lucidrains / tableformer-pytorch
Implementation of TableFormer, Robust Transformer Modeling for Table-Text Encoding, in Pytorch
☆39 · Mar 29, 2022 · Updated 4 years ago
Enealor / PyTorch-SM3
Implements the SM3-II adaptive optimization algorithm for PyTorch.
☆33 · Sep 3, 2024 · Updated last year