kyegomez / ViTAR
Implementation of ViTAR: Vision Transformer with Any Resolution in PyTorch
☆22 · Updated last week
Related projects:
- Introduce Mamba2 to Vision. ☆70 · Updated 3 weeks ago
- ☆19 · Updated 11 months ago
- ☆11 · Updated last week
- My implementation of the original transformer model (Vaswani et al.). I've additionally included the playground.py file for visualizing o… ☆36 · Updated 9 months ago
- StyleDiffusion: Prompt-Embedding Inversion for Text-Based Editing ☆50 · Updated last month
- Awesome works based on SSM and Mamba ☆14 · Updated 5 months ago
- This repository contains the PyTorch code for our ISBI 2024 paper "ConvLoRA and AdaBN Based Domain Adaptation via Self-Training". ☆45 · Updated 6 months ago
- [CVPR'24] Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities ☆85 · Updated 6 months ago
- Open source community's implementation of the model from "LANGUAGE MODEL BEATS DIFFUSION — TOKENIZER IS KEY TO VISUAL GENERATION" ☆15 · Updated last week
- Official PyTorch Implementation of Self-emerging Token Labeling ☆30 · Updated 5 months ago
- Second Generation of the MAMBA Software ☆27 · Updated last year
- ☆24 · Updated last month
- The official code of "U-DiTs: Downsample Tokens in U-Shaped Diffusion Transformers" ☆64 · Updated 3 months ago
- Official implementation of the paper "GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model" ☆56 · Updated 2 months ago
- An efficient PyTorch implementation of selective scan in one file, works with both CPU and GPU, with corresponding mathematical derivatio… ☆64 · Updated 6 months ago
- [ICML 2024] This repository includes the official implementation of our paper "Rejuvenating image-GPT as Strong Visual Representation Lea… ☆96 · Updated 4 months ago
- [CVPR 2024] The official PyTorch implementation of "A General and Efficient Training for Transformer via Token Expansion". ☆36 · Updated 4 months ago
- ☆32 · Updated 8 months ago
- ☆41 · Updated 5 months ago
- Transformer-Mamba Diffusion Models ☆78 · Updated 2 months ago
- The official implementation of DiM: Diffusion Mamba for Efficient High-Resolution Image Synthesis ☆147 · Updated 2 months ago
- This repository is the official implementation of our Autoregressive Pretraining with Mamba in Vision ☆53 · Updated 3 months ago
- Scaling RWKV-Like Architectures for Diffusion Models ☆110 · Updated 5 months ago
- [ECCV2024] ProxyCLIP: Proxy Attention Improves CLIP for Open-Vocabulary Segmentation ☆45 · Updated 2 weeks ago
- ☆21 · Updated last week
- Official repository of the paper "Subobject-level Image Tokenization" ☆58 · Updated 4 months ago
- Project for "LaSagnA: Language-based Segmentation Assistant for Complex Queries". ☆43 · Updated 4 months ago
- List of papers related to State Space Models (Mamba) in Vision. ☆37 · Updated 2 months ago
- Repo for the paper "ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models" ☆44 · Updated 3 weeks ago
- Collect papers about Mamba (a selective state space model). ☆13 · Updated last month