☆38 · Feb 6, 2025 · Updated last year
Alternatives and similar repositories for CoordTok
Users that are interested in CoordTok are comparing it to the libraries listed below.
- Visual Representation Learning with Stochastic Frame Prediction (ICML 2024) ☆26 · Nov 27, 2024 · Updated last year
- ElasticTok: Adaptive Tokenization for Image and Video ☆91 · Nov 4, 2024 · Updated last year
- Efficiently send large arrays across machines ☆17 · Jul 24, 2024 · Updated last year
- ☆18 · Jun 8, 2023 · Updated 2 years ago
- GreenAug: Green Screen Augmentation Enables Scene Generalisation in Robotic Manipulation ☆13 · Sep 10, 2024 · Updated last year
- Adapting Self-Supervised Representations as a Latent Space for Efficient Generation ☆41 · Oct 17, 2025 · Updated 6 months ago
- [CVPR 2023] Spatial-then-Temporal Self-Supervised Learning for Video Correspondence ☆11 · Jul 5, 2023 · Updated 2 years ago
- ☆10 · Sep 25, 2024 · Updated last year
- Official Pytorch implementation for LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior (ICLR 2025 Oral). ☆102 · Feb 11, 2025 · Updated last year
- CVPR 2022: Learning from Untrimmed Videos: Self-Supervised Video Representation Learning with Hierarchical Consistency ☆18 · Aug 10, 2022 · Updated 3 years ago
- [RA-L 2024] Novel action spaces leveraging redundancy in 7 DoF arms enable efficient & precise learning in robotic manipulation ☆21 · Jun 6, 2024 · Updated last year
- Coarse-to-fine Q-Network ☆59 · Aug 6, 2024 · Updated last year
- [ICLR'24] Efficient Video Diffusion Models via Content-Frame Motion-Latent Decomposition ☆54 · May 14, 2024 · Updated last year
- Subtask-Aware Visual Reward Learning from Segmented Demonstrations (ICLR 2025 accepted) ☆18 · Apr 11, 2025 · Updated last year
- Official implementation of paper ReTaKe: Reducing Temporal and Knowledge Redundancy for Long Video Understanding ☆40 · Mar 16, 2025 · Updated last year
- RE3: State Entropy Maximization with Random Encoders for Efficient Exploration ☆69 · Jul 29, 2021 · Updated 4 years ago
- (ICLR 2025) Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation ☆16 · Apr 29, 2025 · Updated 11 months ago
- Official PyTorch Implementation for Vision-Language Models Create Cross-Modal Task Representations, ICML 2025 ☆33 · May 1, 2025 · Updated 11 months ago
- ☆61 · Apr 16, 2023 · Updated 3 years ago
- The code for "VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by VIdeo SpatioTemporal Augmentation" [CVPR2025] ☆21 · Feb 27, 2025 · Updated last year
- ☆24 · Dec 4, 2020 · Updated 5 years ago
- A Video Tokenizer Evaluation Dataset ☆153 · Jan 13, 2025 · Updated last year
- Diffusion Powers Video Tokenizer for Comprehension and Generation (CVPR 2025) ☆86 · Feb 27, 2025 · Updated last year
- Implementation of SmoothCache, a project aimed at speeding-up Diffusion Transformer (DiT) based GenAI models with error-guided caching. ☆48 · Jul 17, 2025 · Updated 9 months ago
- Learning Large-scale Neural Fields via Context Pruned Meta-Learning (NeurIPS 2023) ☆28 · Sep 24, 2023 · Updated 2 years ago
- Adapting LLaMA Decoder to Vision Transformer ☆30 · May 20, 2024 · Updated last year
- [ICCV 2025] Official code for Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation ☆59 · Sep 12, 2025 · Updated 7 months ago
- [ICML 2025] Official Implementation of Hierarchical Masked Autoregressive Models with Low-Resolution Token Pivots ☆29 · May 28, 2025 · Updated 10 months ago
- ☆14 · Apr 25, 2025 · Updated 11 months ago
- [EMNLP 2024] Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality ☆22 · Oct 8, 2024 · Updated last year
- [TACL/EMNLP'24] Do Vision and Language Models Share Concepts? A Vector Space Alignment Study ☆16 · Nov 22, 2024 · Updated last year
- Official implementation of "Exploring Temporally-Aware Features for Point Tracking" (CVPR 2025) ☆103 · Apr 5, 2025 · Updated last year
- This repo contains the code for 1D tokenizer and generator ☆1,143 · Mar 20, 2025 · Updated last year
- [CVPR 2024] On the Content Bias in Fréchet Video Distance ☆146 · Sep 28, 2024 · Updated last year
- Jump to better conclusions: SCAN both left and right ☆11 · Jan 24, 2019 · Updated 7 years ago
- Paper dataset for "Factored Verification: Detecting and Reducing Hallucination in Summaries of Academic Papers" ☆13 · Oct 20, 2024 · Updated last year
- SketchINR: A First Look into Sketches as Implicit Neural Representations [CVPR 2024] ☆12 · Aug 19, 2024 · Updated last year
- ☆24 · May 23, 2025 · Updated 10 months ago
- Official implementation of ECCV24 paper: POA ☆24 · Aug 8, 2024 · Updated last year