KempnerInstitute / raptor
Block-Recurrent Dynamics in ViTs
☆24 · Updated last month
Alternatives and similar repositories for raptor
Users who are interested in raptor are comparing it to the repositories listed below.
- [NeurIPS 2025 Oral] Official Code for Exploring Diffusion Transformer Designs via Grafting ☆70 · Updated 3 weeks ago
- [ICML 2024] Compositional Image Decomposition with Diffusion Models ☆53 · Updated last year
- ☆21 · Updated last year
- Official repo for UAE ☆161 · Updated last month
- Code release for paper "Test-Time Training Done Right" ☆367 · Updated 3 weeks ago
- Official PyTorch implementation of FlowMo. ☆110 · Updated 9 months ago
- [ICLR 2025] Official implementation and benchmark evaluation repository of <PhysBench: Benchmarking and Enhancing Vision-Language Models … ☆83 · Updated 2 weeks ago
- ☆162 · Updated last year
- Official repository of PhysMaster: Mastering Physical Representation for Video Generation via Reinforcement Learning ☆58 · Updated 3 months ago
- ElasticTok: Adaptive Tokenization for Image and Video ☆88 · Updated last year
- Official PyTorch Implementation of "Flow Map Distillation Without Data" ☆115 · Updated 2 months ago
- ☆38 · Updated 11 months ago
- [CVPR 2025] Science-T2I: Addressing Scientific Illusions in Image Synthesis ☆62 · Updated 9 months ago
- PyTorch Implementation of Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model ☆27 · Updated last year
- Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders ☆188 · Updated last week
- Diffusion Powers Video Tokenizer for Comprehension and Generation (CVPR 2025) ☆86 · Updated 11 months ago
- Official PyTorch Implementation of "Latent Denoising Makes Good Visual Tokenizers" ☆172 · Updated last month
- Official Repo of From Masks to Worlds: A Hitchhiker’s Guide to World Models. ☆71 · Updated 3 months ago
- Code for ICML 2025 Paper "Highly Compressed Tokenizer Can Generate Without Training" ☆201 · Updated 7 months ago
- [ICLR 2026] Uniform Discrete Diffusion with Metric Path for Video Generation ☆94 · Updated 2 weeks ago
- [ICCV 2025] Official code for Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation ☆55 · Updated 4 months ago
- [arXiv: 2502.05178] QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation ☆95 · Updated 11 months ago
- the official repo for "D-AR: Diffusion via Autoregressive Models" ☆132 · Updated this week
- Thinking with Videos from Open-Source Priors. We reproduce chain-of-frames visual reasoning by fine-tuning open-source video models. Give… ☆206 · Updated 3 months ago
- Official PyTorch Implementation for Dual-Process Image Generation, ICCV 2025 ☆122 · Updated 5 months ago
- [NeurIPS 2025] Source codes for the paper "MindJourney: Test-Time Scaling with World Models for Spatial Reasoning" ☆125 · Updated 2 months ago
- [ICCV 2025 Workshop Outstanding Paper Award] VChain: Chain-of-Visual-Thought for Reasoning in Video Generation ☆115 · Updated 3 months ago
- Official Implementation of iMF https://arxiv.org/abs/2512.02012 ☆101 · Updated this week
- A Video Tokenizer Evaluation Dataset ☆149 · Updated last year
- A Large-scale Video Action Dataset ☆376 · Updated 2 weeks ago