☆34 · May 14, 2025 · Updated 10 months ago
Alternatives and similar repositories for vitok
Users that are interested in vitok are comparing it to the libraries listed below
- ☆13 · Nov 1, 2023 · Updated 2 years ago
- An implementation of several unsupervised object discovery models (Slot Attention, SLATE, GNM) in PyTorch with pre-trained models. ☆14 · May 26, 2025 · Updated 9 months ago
- [ICML'25] EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling. ☆175 · Updated this week
- Official repo for From Intention to Execution: Probing the Generalization Boundaries of Vision-Language-Action Models ☆32 · Nov 2, 2025 · Updated 4 months ago
- Single-pass Adaptive Image Tokenization for Minimum Program Search | What's the Kolmogorov Complexity of an Image? ☆42 · Jul 26, 2025 · Updated 7 months ago
- ☆23 · Jun 18, 2024 · Updated last year
- A basic pure pytorch implementation of flash attention ☆16 · Oct 28, 2024 · Updated last year
- ☆40 · Jun 6, 2025 · Updated 9 months ago
- [ICLR'24] Efficient Video Diffusion Models via Content-Frame Motion-Latent Decomposition ☆55 · May 14, 2024 · Updated last year
- This repo contains evaluation code for the paper "AV-Odyssey: Can Your Multimodal LLMs Really Understand Audio-Visual Information?" ☆31 · Dec 23, 2024 · Updated last year
- Official PyTorch Implementation of "Latent Denoising Makes Good Visual Tokenizers" ☆180 · Feb 24, 2026 · Updated 3 weeks ago
- Experimental GPU language with meta-programming ☆27 · Sep 6, 2024 · Updated last year
- Pytorch implementation of Twelve Labs' Video Foundation Model evaluation framework & open embeddings ☆32 · Aug 23, 2024 · Updated last year
- Official PyTorch Implementation of "Diffusion Autoencoders are Scalable Image Tokenizers" ☆166 · Jan 31, 2025 · Updated last year
- Code for MetaMorph Multimodal Understanding and Generation via Instruction Tuning ☆235 · Jan 22, 2026 · Updated last month
- PyTorch implementation of RWKV blocks ☆32 · Jul 22, 2025 · Updated 7 months ago
- ☆20 · Nov 23, 2022 · Updated 3 years ago
- [ICLR 26] The official code repository for the paper "Mirage or Method? How Model–Task Alignment Induces Divergent RL Conclusions". ☆17 · Feb 9, 2026 · Updated last month
- Slot-TTA shows that test-time adaptation using slot-centric models can improve image segmentation on out-of-distribution examples. ☆26 · Jun 20, 2023 · Updated 2 years ago
- ☆20 · Mar 25, 2025 · Updated 11 months ago
- [NeurIPS 2025 Spotlight] A Unified Tokenizer for Visual Generation and Understanding ☆518 · Nov 14, 2025 · Updated 4 months ago
- [ICML-2025] We introduce Lie group Relative position Encodings (LieRE) that goes beyond RoPE in supporting n-dimensional inputs. ☆30 · Aug 13, 2025 · Updated 7 months ago
- Measuring the Signal to Noise Ratio in Language Model Evaluation ☆29 · Aug 19, 2025 · Updated 7 months ago
- A comprehensive codebase for training and finetuning Image <> Latent models. ☆50 · Mar 1, 2025 · Updated last year
- Official Pytorch implementation for LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior (ICLR 2025 Oral). ☆100 · Feb 11, 2025 · Updated last year
- Demonstration that finetuning RoPE model on larger sequences than the pre-trained model adapts the model context limit ☆63 · Jun 21, 2023 · Updated 2 years ago
- new optimizer ☆20 · Aug 4, 2024 · Updated last year
- ☆32 · Jul 29, 2024 · Updated last year
- supporting pytorch FSDP for optimizers ☆84 · Dec 8, 2024 · Updated last year
- Code release for "MORE: Multi-mOdal REtrieval Augmented Generative Commonsense Reasoning" ☆11 · Oct 11, 2024 · Updated last year
- Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters ☆133 · Dec 3, 2024 · Updated last year
- ☆17 · Mar 2, 2023 · Updated 3 years ago
- 2nd place solution of ECCV 2020 workshop VIPriors Image Classification Challenge, https://arxiv.org/abs/2008.00261 ☆13 · Aug 22, 2021 · Updated 4 years ago
- The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag… ☆134 · Feb 21, 2026 · Updated last month
- ☆28 · Feb 15, 2026 · Updated last month
- Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations" ☆92 · Oct 30, 2024 · Updated last year
- (ICCV 2025) "Principal Components" Enable A New Language of Images ☆80 · Jul 28, 2025 · Updated 7 months ago
- Official repository for "Solving Video Inverse Problems Using Image Diffusion Models" ☆11 · Mar 7, 2026 · Updated 2 weeks ago
- Pixel-Space Generative Models ☆305 · May 11, 2025 · Updated 10 months ago