naver-ai/rope-vit

naver-ai / rope-vit

[ECCV 2024] Official PyTorch implementation of RoPE-ViT "Rotary Position Embedding for Vision Transformer"

☆467

Alternatives and similar repositories for rope-vit

Users who are interested in rope-vit are comparing it to the libraries listed below.

naver-ai / lut
[ECCV 2024] Official PyTorch implementation of LUT "Learning with Unmasked Tokens Drives Stronger Vision Learners"
☆14 · Dec 1, 2024 · Updated last year
sihyun-yu / REPA
[ICLR'25 Oral] Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
☆1,683 · Mar 16, 2025 · Updated last year
lucidrains / rotary-embedding-torch
Implementation of Rotary Embeddings, from the Roformer paper, in Pytorch
☆818 · Jun 20, 2026 · Updated last month
bytedance / 1d-tokenizer
This repo contains the code for 1D tokenizer and generator
☆1,168 · Mar 20, 2025 · Updated last year
Stanford-AIMI / LieRE
[ICML-2025] We introduce Lie group Relative position Encodings (LieRE) that goes beyond RoPE in supporting n-dimensional inputs.
☆14 · Aug 8, 2025 · Updated 11 months ago
naver-ai / seit
[ECCV2024][ICCV2023] Official PyTorch implementation of SeiT++ and SeiT
☆56 · Aug 12, 2024 · Updated last year
naver-ai / hype
[ECCV 2024] Official PyTorch implementation of "HYPE: Hyperbolic Entailment Filtering for Underspecified Images and Texts"
☆20 · Nov 22, 2024 · Updated last year
naver-ai / prolip
☆58 · Aug 16, 2025 · Updated 11 months ago
sail-sg / MDT
Masked Diffusion Transformer is the SOTA for image synthesis. (ICCV 2023)
☆596 · Apr 23, 2024 · Updated 2 years ago
whlzy / FiT
[ICML 2024 Spotlight] FiT: Flexible Vision Transformer for Diffusion Model
☆434 · Nov 10, 2024 · Updated last year
FoundationVision / LlamaGen
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
☆1,960 · Aug 15, 2024 · Updated last year
LTH14 / mar
PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838
☆1,944 · Feb 20, 2026 · Updated 5 months ago
Meituan-AutoML / VisionLLaMA
VisionLLaMA: A Unified LLaMA Backbone for Vision Tasks
☆392 · Jul 9, 2024 · Updated 2 years ago
facebookresearch / DiT
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
☆8,693 · May 31, 2024 · Updated 2 years ago
naver-ai / model-stock
Model Stock: All we need is just a few fine-tuned models
☆129 · Aug 9, 2025 · Updated 11 months ago
google-research / big_vision
Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.
☆3,502 · May 19, 2025 · Updated last year
apple / ml-aim
This repository provides the code and model checkpoints for AIMv1 and AIMv2 research projects.
☆1,425 · Aug 4, 2025 · Updated 11 months ago
naver-ai / rdnet
[ECCV2024] Official implementation of paper, "DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs".
☆155 · Aug 8, 2024 · Updated last year
naver-ai / muco
Official Pytorch implementation of MuCo: Multi-turn Contrastive Learning for Multimodal Embedding Model (CVPR 2026)
☆15 · Apr 16, 2026 · Updated 3 months ago
rwightman / imagenet-12k
ImageNet-12k subset of ImageNet-21k (fall11)
☆23 · Jun 13, 2023 · Updated 3 years ago
willisma / SiT
Official PyTorch Implementation of "SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers"
☆1,193 · Dec 22, 2025 · Updated 7 months ago
facebookresearch / perception_models
State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!
☆2,330 · Apr 13, 2026 · Updated 3 months ago
baaivision / EVE
EVE Series: Encoder-Free Vision-Language Models from BAAI
☆376 · Jul 24, 2025 · Updated last year
valeoai / Halton-MaskGIT
[ICLR2025] Halton Scheduler for Masked Generative Image Transformer
☆286 · Oct 28, 2025 · Updated 9 months ago
NVIDIA / Cosmos-Tokenizer
A suite of image and video neural tokenizers
☆1,732 · Feb 11, 2025 · Updated last year
NVlabs / RADIO
Official repository for "AM-RADIO: Reduce All Domains Into One"
☆1,906 · May 29, 2026 · Updated 2 months ago
hustvl / LightningDiT
[CVPR 2025 Oral] Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models
☆1,512 · Dec 16, 2025 · Updated 7 months ago
hustvl / DiG
[CVPR 2025] DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention
☆185 · Mar 1, 2025 · Updated last year
buoyancy99 / diffusion-forcing
code for "Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion"
☆1,280 · Jul 6, 2026 · Updated 3 weeks ago
CircleRadon / TokenPacker
The code for "TokenPacker: Efficient Visual Projector for Multimodal LLM", IJCV2025
☆280 · May 26, 2025 · Updated last year
TencentARC / SEED-Voken
SEED-Voken: A Series of Powerful Visual Tokenizers
☆1,020 · Nov 25, 2025 · Updated 8 months ago
mit-han-lab / efficientvit
Efficient vision foundation models for high-resolution generation and perception.
☆3,338 · Sep 5, 2025 · Updated 10 months ago
SHI-Labs / NATTEN
Fast Multi-dimensional Sparse Attention
☆779 · Updated this week
FoundationVision / VAR
[NeurIPS 2024 Best Paper Award][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Mod…
☆8,714 · Nov 10, 2025 · Updated 8 months ago
andrehuang / loftup
[ICCV'25 oral] Official Code for "LoftUp: Learning a Coordinate-Based Feature Upsampler for Vision Foundation Models"
☆261 · Jan 13, 2026 · Updated 6 months ago
ChuanyangZheng / L2ViT
Official PyTorch implementation of The Linear Attention Resurrection in Vision Transformer
☆15 · Sep 7, 2024 · Updated last year
bfshi / scaling_on_scales
When do we not need larger vision models?
☆420 · Feb 8, 2025 · Updated last year
beichenzbc / Long-CLIP
[ECCV 2024] official code for "Long-CLIP: Unlocking the Long-Text Capability of CLIP"
☆901 · Aug 13, 2024 · Updated last year
bytetriper / RAE
Official PyTorch Implementation of "Diffusion Transformers with Representation Autoencoders"
☆1,981 · Feb 25, 2026 · Updated 5 months ago