naver-ai / rope-vit
[ECCV 2024] Official PyTorch implementation of RoPE-ViT "Rotary Position Embedding for Vision Transformer"
☆315 Updated 4 months ago
Alternatives and similar repositories for rope-vit:
Users who are interested in rope-vit are comparing it to the repositories listed below.
- Open source implementation of "Vision Transformers Need Registers" ☆175 Updated 2 weeks ago
- When do we not need larger vision models? ☆388 Updated 2 months ago
- [ICLR2025] Halton Scheduler for Masked Generative Image Transformer ☆220 Updated 2 weeks ago
- Implementation of Autoregressive Diffusion in Pytorch ☆372 Updated 5 months ago
- An efficient pytorch implementation of selective scan in one file, works with both cpu and gpu, with corresponding mathematical derivatio… ☆85 Updated last year
- This repo contains the code for 1D tokenizer and generator ☆838 Updated last month
- My implementation of "Patch n’ Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution" ☆228 Updated 3 weeks ago
- This is the official code release for our work, Denoising Vision Transformers. ☆360 Updated 5 months ago
- Implementation of Soft MoE, proposed by Brain's Vision team, in Pytorch ☆283 Updated 3 weeks ago
- The official repo for [TPAMI'23] "Vision Transformer with Quadrangle Attention" ☆208 Updated last year
- Masked Diffusion Transformer is the SOTA for image synthesis. (ICCV 2023) ☆560 Updated last year
- Code for Scaling Language-Free Visual Representation Learning (WebSSL) ☆245 Updated this week
- High-performance Image Tokenizers for VAR and AR ☆247 Updated 2 weeks ago
- 1.5−3.0× lossless training or pre-training speedup. An off-the-shelf, easy-to-implement algorithm for the efficient training of foundatio… ☆221 Updated 8 months ago
- ☆122 Updated 9 months ago
- A collection of resources and papers on Vector Quantized Variational Autoencoder (VQ-VAE) and its application ☆274 Updated 2 months ago
- EVE Series: Encoder-Free Vision-Language Models from BAAI ☆322 Updated last month
- Implementation of MagViT2 Tokenizer in Pytorch ☆600 Updated 3 months ago
- [CVPR 2025 Highlight] The official CLIP training codebase of Inf-CL: "Breaking the Memory Barrier: Near Infinite Batch Size Scaling for C… ☆240 Updated 3 months ago
- Code for Fast Training of Diffusion Models with Masked Transformers ☆398 Updated 11 months ago
- A PyTorch implementation of the paper "ZigMa: A DiT-Style Mamba-based Diffusion Model" (ECCV 2024) ☆304 Updated last month
- Implementation of a single layer of the MMDiT, proposed in Stable Diffusion 3, in Pytorch ☆342 Updated 3 months ago
- Scaling Diffusion Transformers with Mixture of Experts ☆311 Updated 7 months ago
- Official implementation of "Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers" ☆128 Updated 2 months ago
- ☆95 Updated 3 weeks ago
- Official Open Source code for "Masked Autoencoders As Spatiotemporal Learners" ☆336 Updated 5 months ago
- This repository includes the official implementation of our paper "Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generat… ☆191 Updated last month
- [NeurIPS 2024] The official code of "U-DiTs: Downsample Tokens in U-Shaped Diffusion Transformers" ☆202 Updated 6 months ago
- [CVPR'24] Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities ☆99 Updated last year
- N-dimensional Rotary Position Embeddings for PyTorch ☆49 Updated last year