archinetai / difformer-pytorch
Diffusion based transformer, in PyTorch (Experimental).
☆24 Updated 2 years ago
Alternatives and similar repositories for difformer-pytorch:
Users who are interested in difformer-pytorch are comparing it to the libraries listed below.
- Official implementation for the paper "A Cheaper and Better Diffusion Language Model with Soft-Masked Noise" ☆53 Updated last year
- Curse-of-memory phenomenon of RNNs in sequence modelling ☆19 Updated this week
- [ICML 2022] Latent Diffusion Energy-Based Model for Interpretable Text Modeling ☆64 Updated 2 years ago
- Official code implementation for the work Preference Alignment with Flow Matching (NeurIPS 2024) ☆20 Updated 2 months ago
- ☆13 Updated last year
- Implementation of TableFormer, Robust Transformer Modeling for Table-Text Encoding, in Pytorch ☆37 Updated 2 years ago
- Position Prediction as an Effective Pretraining Strategy ☆8 Updated last year
- Un-*** 50 billions multimodality dataset ☆24 Updated 2 years ago
- Implementation of Retrieval-Augmented Denoising Diffusion Probabilistic Models in Pytorch ☆64 Updated 2 years ago
- ☆36 Updated 5 months ago
- Code for the paper PermuteFormer ☆42 Updated 3 years ago
- Exploration into the Scaling Value Iteration Networks paper, from Schmidhuber's group ☆36 Updated 3 months ago
- Official code for the paper: "Metadata Archaeology" ☆18 Updated last year
- The official repository for our paper "The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns … ☆16 Updated last year
- Open source community's implementation of the model from "LANGUAGE MODEL BEATS DIFFUSION — TOKENIZER IS KEY TO VISUAL GENERATION" ☆15 Updated 2 months ago
- Implementation of Metaformer, but in an autoregressive manner ☆23 Updated 2 years ago
- ☆49 Updated 2 years ago
- Code for the PAPA paper ☆27 Updated 2 years ago
- Official implementation of the paper "Provable Stochastic Optimization for Global Contrastive Learning: Small Batch Does Not Harm Perform… ☆20 Updated last year
- ☆16 Updated 6 months ago
- Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing ☆47 Updated 2 years ago
- Unofficial PyTorch implementation of "Step-unrolled Denoising Autoencoders for Text Generation" ☆23 Updated 2 years ago
- Official Code for ICLR 2024 Paper: Non-negative Contrastive Learning ☆38 Updated 9 months ago
- Towards Understanding the Mixture-of-Experts Layer in Deep Learning ☆22 Updated last year
- An implementation of (Induced) Set Attention Block, from the Set Transformers paper ☆56 Updated 2 years ago
- Implementation of Memformer, a Memory-augmented Transformer, in Pytorch ☆107 Updated 4 years ago
- ☆51 Updated 7 months ago
- Implementation of some personal helper functions for Einops, my most favorite tensor manipulation library ❤️ ☆53 Updated 2 years ago