☆38Feb 6, 2025Updated last year
Alternatives and similar repositories for CoordTok
Users that are interested in CoordTok are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Visual Representation Learning with Stochastic Frame Prediction (ICML 2024)☆26Nov 27, 2024Updated last year
- ElasticTok: Adaptive Tokenization for Image and Video☆90Nov 4, 2024Updated last year
- ☆18Jun 8, 2023Updated 2 years ago
- Jaehyung Kim et al's ACL 2023 paper on "infoVerse: A Universal Framework for Dataset Characterization with Multidimensional Meta-informat…☆16Jun 28, 2023Updated 2 years ago
- Adapting Self-Supervised Representations as a Latent Space for Efficient Generation☆40Oct 17, 2025Updated 5 months ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click and start building anything your business needs.
- GreenAug: Green Screen Augmentation Enables Scene Generalisation in Robotic Manipulation☆13Sep 10, 2024Updated last year
- ☆18May 30, 2023Updated 2 years ago
- ☆10Sep 25, 2024Updated last year
- Official Pytorch implementation for LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior (ICLR 2025 Oral).☆100Feb 11, 2025Updated last year
- CVPR2022:Learning from Untrimmed Videos: Self-Supervised Video Representation Learning with Hierarchical Consistency☆18Aug 10, 2022Updated 3 years ago
- Coarse-to-fine Q-Network☆59Aug 6, 2024Updated last year
- PyTorch code accompanying the paper "Imitating Graph-Based Planning with Goal-Conditioned Policies" (ICLR 2023).☆21Mar 4, 2023Updated 3 years ago
- [ICLR'24] Efficient Video Diffusion Models via Content-Frame Motion-Latent Decomposition☆55May 14, 2024Updated last year
- Subtask-Aware Visual Reward Learning from Segmented Demonstrations (ICLR 2025 accepted)☆18Apr 11, 2025Updated 11 months ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click and start building anything your business needs.
- Modality-Agnostic Self-Supervised Learning with Meta-Learned Masked Auto-Encoder (NeurIPS 2023)☆10Jun 5, 2024Updated last year
- RE3: State Entropy Maximization with Random Encoders for Efficient Exploration☆69Jul 29, 2021Updated 4 years ago
- (ICLR 2025) Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation☆15Apr 29, 2025Updated 11 months ago
- Official PyTorch Implementation for Vision-Language Models Create Cross-Modal Task Representations, ICML 2025☆33May 1, 2025Updated 10 months ago
- ☆60Apr 16, 2023Updated 2 years ago
- [CVPR2025] Controllable Human Image Generation with Personalized Multi-Garments☆57Mar 23, 2025Updated last year
- The code for "VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by VIdeo SpatioTemporal Augmentation" [CVPR2025]☆21Feb 27, 2025Updated last year
- A Video Tokenizer Evaluation Dataset☆152Jan 13, 2025Updated last year
- Diffusion Powers Video Tokenizer for Comprehension and Generation (CVPR 2025)☆86Feb 27, 2025Updated last year
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting with the flexibility to host WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Cloudways by DigitalOcean.
- Implementation of SmoothCache, a project aimed at speeding-up Diffusion Transformer (DiT) based GenAI models with error-guided caching.☆48Jul 17, 2025Updated 8 months ago
- Learning Large-scale Neural Fields via Context Pruned Meta-Learning (NeurIPS 2023)☆28Sep 24, 2023Updated 2 years ago
- Adapting LLaMA Decoder to Vision Transformer☆30May 20, 2024Updated last year
- [ICML 2025] Official Implementation of Hierarchical Masked Autoregressive Models with Low-Resolution Token Pivots☆30May 28, 2025Updated 10 months ago
- [ICCV 2025] Official code for Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation☆57Sep 12, 2025Updated 6 months ago
- ☆14Apr 25, 2025Updated 11 months ago
- [TACL/EMNLP'24] Do Vision and Language Models Share Concepts? A Vector Space Alignment Study☆16Nov 22, 2024Updated last year
- CatMAE☆14Dec 13, 2023Updated 2 years ago
- Official implementation of "Exploring Temporally-Aware Features for Point Tracking" (CVPR 2025)☆106Apr 5, 2025Updated 11 months ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click and start building anything your business needs.
- ☆20Apr 18, 2024Updated last year
- This repo contains the code for 1D tokenizer and generator☆1,134Mar 20, 2025Updated last year
- Jump to better conclusions: SCAN both left and right☆11Jan 24, 2019Updated 7 years ago
- [CVPR 2024] On the Content Bias in Fréchet Video Distance☆144Sep 28, 2024Updated last year
- Paper dataset for "Factored Verification: Detecting and Reducing Hallucination in Summaries of Academic Papers"☆13Oct 20, 2024Updated last year
- An official implementation of "RankMixup: Ranking-Based Mixup Training for Network Calibration" (ICCV 2023) in PyTorch.☆11Dec 18, 2023Updated 2 years ago
- [NeurIPS 2024] GenRL: Multimodal-foundation world models enable grounding language and video prompts into embodied domains, by turning th…☆86Apr 4, 2025Updated 11 months ago