☆34 May 14, 2025 Updated 10 months ago
Alternatives and similar repositories for vitok
Users that are interested in vitok are comparing it to the libraries listed below.
- [NeurIPS 2025] Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations ☆202 Sep 18, 2025 Updated 6 months ago
- An implementation of several unsupervised object discovery models (Slot Attention, SLATE, GNM) in PyTorch with pre-trained models. ☆15 May 26, 2025 Updated 10 months ago
- A big_vision inspired repo that implements a generic Auto-Encoder class capable in representation learning and generative modeling. ☆34 Jun 26, 2024 Updated last year
- This repo contains the code for the paper "Object-cropping for SSL". ☆18 Feb 14, 2023 Updated 3 years ago
- [ICML'25] EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling. ☆176 Mar 18, 2026 Updated 3 weeks ago
- Single-pass Adaptive Image Tokenization for Minimum Program Search | What's the Kolmogorov Complexity of an Image? ☆42 Jul 26, 2025 Updated 8 months ago
- Official repo for From Intention to Execution: Probing the Generalization Boundaries of Vision-Language-Action Models ☆31 Nov 2, 2025 Updated 5 months ago
- ☆23 Jun 18, 2024 Updated last year
- Official implementation of SimFlow ☆31 Dec 16, 2025 Updated 3 months ago
- (CVPR 2025) A Data-Centric Revisit of Pre-Trained Vision Models for Robot Learning ☆24 Mar 11, 2025 Updated last year
- A basic pure pytorch implementation of flash attention ☆16 Oct 28, 2024 Updated last year
- [NeurIPS 2022] code for "Visual Concepts Tokenization" ☆23 Oct 10, 2022 Updated 3 years ago
- ☆62 Oct 29, 2022 Updated 3 years ago
- [ICLR'24] Efficient Video Diffusion Models via Content-Frame Motion-Latent Decomposition ☆54 May 14, 2024 Updated last year
- Official PyTorch Implementation of "Latent Denoising Makes Good Visual Tokenizers" ☆181 Feb 24, 2026 Updated last month
- Pytorch implementation of Twelve Labs' Video Foundation Model evaluation framework & open embeddings ☆33 Aug 23, 2024 Updated last year
- ACCO: An optimization algorithm for sharded distributed LLM training. ☆13 May 22, 2025 Updated 10 months ago
- Official PyTorch Implementation of "Diffusion Autoencoders are Scalable Image Tokenizers" ☆166 Jan 31, 2025 Updated last year
- Code for MetaMorph Multimodal Understanding and Generation via Instruction Tuning ☆236 Jan 22, 2026 Updated 2 months ago
- ☆20 Nov 23, 2022 Updated 3 years ago
- [ICLR 26] The official code repository for the paper "Mirage or Method? How Model–Task Alignment Induces Divergent RL Conclusions". ☆17 Feb 9, 2026 Updated 2 months ago
- Code release for "Generative Modeling of Weights: Generalization or Memorization?" ☆19 Updated this week
- ☆20 Mar 25, 2025 Updated last year
- Official code of "Optimizing 4D Gaussians for Dynamic Scene Video from Single Landscape Images (ICLR 2025)" ☆27 Mar 4, 2026 Updated last month
- ☆19 Nov 5, 2025 Updated 5 months ago
- [ICML-2025] We introduce Lie group Relative position Encodings (LieRE) that goes beyond RoPE in supporting n-dimensional inputs. ☆30 Aug 13, 2025 Updated 7 months ago
- Measuring the Signal to Noise Ratio in Language Model Evaluation ☆29 Aug 19, 2025 Updated 7 months ago
- A comprehensive codebase for training and finetuning Image <> Latent models. ☆50 Mar 1, 2025 Updated last year
- Demonstration that finetuning RoPE model on larger sequences than the pre-trained model adapts the model context limit ☆63 Jun 21, 2023 Updated 2 years ago
- JAX Scalify: end-to-end scaled arithmetics ☆18 Oct 30, 2024 Updated last year
- Code release for "MORE: Multi-mOdal REtrieval Augmented Generative Commonsense Reasoning" ☆11 Oct 11, 2024 Updated last year
- ☆53 Jan 18, 2024 Updated 2 years ago
- Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters ☆132 Dec 3, 2024 Updated last year
- Benchmarking Multi-Image Understanding in Vision and Language Models ☆12 Jul 29, 2024 Updated last year
- Extracting Relationships by Multi-Domain Matching ☆11 Mar 21, 2019 Updated 7 years ago
- 2nd place solution of ECCV 2020 workshop VIPriors Image Classification Challenge, https://arxiv.org/abs/2008.00261 ☆13 Aug 22, 2021 Updated 4 years ago
- The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag… ☆139 Apr 3, 2026 Updated last week
- (ICCV 2025) "Principal Components" Enable A New Language of Images ☆82 Jul 28, 2025 Updated 8 months ago
- ☆28 Dec 21, 2023 Updated 2 years ago