☆38Feb 6, 2025Updated last year
Alternatives and similar repositories for CoordTok
Users that are interested in CoordTok are comparing it to the libraries listed below.
- Visual Representation Learning with Stochastic Frame Prediction (ICML 2024)☆27Nov 27, 2024Updated last year
- ElasticTok: Adaptive Tokenization for Image and Video☆92Nov 4, 2024Updated last year
- ☆18Jun 8, 2023Updated 3 years ago
- GreenAug: Green Screen Augmentation Enables Scene Generalisation in Robotic Manipulation☆13Sep 10, 2024Updated last year
- [ICLR 2026] Adapting Self-Supervised Representations as a Latent Space for Efficient Generation☆60Apr 24, 2026Updated last month
- ☆18May 30, 2023Updated 3 years ago
- [CVPR 2023] Spatial-then-Temporal Self-Supervised Learning for Video Correspondence☆11Jul 5, 2023Updated 2 years ago
- ☆28Aug 9, 2025Updated 10 months ago
- Official Pytorch implementation for LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior (ICLR 2025 Oral).☆105Feb 11, 2025Updated last year
- CVPR2022:Learning from Untrimmed Videos: Self-Supervised Video Representation Learning with Hierarchical Consistency☆18Aug 10, 2022Updated 3 years ago
- [RA-L 2024] Novel action spaces leveraging redundancy in 7 DoF arms enable efficient & precise learning in robotic manipulation☆23Jun 6, 2024Updated 2 years ago
- Coarse-to-fine Q-Network☆59Aug 6, 2024Updated last year
- PyTorch code accompanying the paper "Imitating Graph-Based Planning with Goal-Conditioned Policies" (ICLR 2023).☆21Mar 4, 2023Updated 3 years ago
- [ICLR'24] Efficient Video Diffusion Models via Content-Frame Motion-Latent Decomposition☆55May 14, 2024Updated 2 years ago
- Subtask-Aware Visual Reward Learning from Segmented Demonstrations (ICLR 2025 accepted)☆19Apr 11, 2025Updated last year
- Official implementation of paper ReTaKe: Reducing Temporal and Knowledge Redundancy for Long Video Understanding☆40Mar 16, 2025Updated last year
- Modality-Agnostic Self-Supervised Learning with Meta-Learned Masked Auto-Encoder (NeurIPS 2023)☆10Jun 5, 2024Updated 2 years ago
- Official PyTorch Implementation for Vision-Language Models Create Cross-Modal Task Representations, ICML 2025☆33May 1, 2025Updated last year
- [CVPR2025] Controllable Human Image Generation with Personalized Multi-Garments☆58Mar 23, 2025Updated last year
- The code for "VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by VIdeo SpatioTemporal Augmentation" [CVPR2025]☆21Feb 27, 2025Updated last year
- ☆24Dec 4, 2020Updated 5 years ago
- TARS: MinMax Token-Adaptive Preference Strategy for Hallucination Reduction in MLLMs☆24Sep 21, 2025Updated 8 months ago
- A Video Tokenizer Evaluation Dataset☆158Jan 13, 2025Updated last year
- Diffusion Powers Video Tokenizer for Comprehension and Generation (CVPR 2025)☆87Feb 27, 2025Updated last year
- Implementation of SmoothCache, a project aimed at speeding-up Diffusion Transformer (DiT) based GenAI models with error-guided caching.☆49Jul 17, 2025Updated 11 months ago
- Learning Large-scale Neural Fields via Context Pruned Meta-Learning (NeurIPS 2023)☆28Sep 24, 2023Updated 2 years ago
- Adapting LLaMA Decoder to Vision Transformer☆30May 20, 2024Updated 2 years ago
- [ICCV 2025] Official code for Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation☆62Sep 12, 2025Updated 9 months ago
- [ICML 2025] Official Implementation of Hierarchical Masked Autoregressive Models with Low-Resolution Token Pivots☆31May 28, 2025Updated last year
- ☆15Apr 25, 2025Updated last year
- [EMNLP 2024] Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality☆22Oct 8, 2024Updated last year
- [TACL/EMNLP'24] Do Vision and Language Models Share Concepts? A Vector Space Alignment Study☆16Nov 22, 2024Updated last year
- CatMAE☆15Dec 13, 2023Updated 2 years ago
- Official implementation of "Exploring Temporally-Aware Features for Point Tracking" (CVPR 2025)☆102Apr 5, 2025Updated last year
- ☆132Feb 22, 2025Updated last year
- This repo contains the code for 1D tokenizer and generator☆1,159Mar 20, 2025Updated last year
- Jump to better conclusions: SCAN both left and right☆11Jan 24, 2019Updated 7 years ago
- FDFO: Finite Difference Flow Optimization☆110Apr 27, 2026Updated last month
- Paper dataset for "Factored Verification: Detecting and Reducing Hallucination in Summaries of Academic Papers"☆13Oct 20, 2024Updated last year