☆38 Feb 6, 2025 Updated last year
Alternatives and similar repositories for CoordTok
Users that are interested in CoordTok are comparing it to the libraries listed below.
- Visual Representation Learning with Stochastic Frame Prediction (ICML 2024) ☆26 Nov 27, 2024 Updated last year
- ElasticTok: Adaptive Tokenization for Image and Video ☆91 Nov 4, 2024 Updated last year
- Efficiently send large arrays across machines ☆17 Jul 24, 2024 Updated last year
- ☆18 Jun 8, 2023 Updated 2 years ago
- Jaehyung Kim et al.'s ACL 2023 paper on "infoVerse: A Universal Framework for Dataset Characterization with Multidimensional Meta-informat… ☆16 Jun 28, 2023 Updated 2 years ago
- GreenAug: Green Screen Augmentation Enables Scene Generalisation in Robotic Manipulation ☆13 Sep 10, 2024 Updated last year
- [ICLR 2026] Adapting Self-Supervised Representations as a Latent Space for Efficient Generation ☆50 Apr 24, 2026 Updated 2 weeks ago
- ☆18 May 30, 2023 Updated 2 years ago
- [CVPR 2023] Spatial-then-Temporal Self-Supervised Learning for Video Correspondence ☆11 Jul 5, 2023 Updated 2 years ago
- ☆10 Sep 25, 2024 Updated last year
- ☆24 Aug 9, 2025 Updated 9 months ago
- Official PyTorch implementation for LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior (ICLR 2025 Oral). ☆103 Feb 11, 2025 Updated last year
- CVPR 2022: Learning from Untrimmed Videos: Self-Supervised Video Representation Learning with Hierarchical Consistency ☆18 Aug 10, 2022 Updated 3 years ago
- [RA-L 2024] Novel action spaces leveraging redundancy in 7 DoF arms enable efficient & precise learning in robotic manipulation ☆21 Jun 6, 2024 Updated last year
- Coarse-to-fine Q-Network ☆59 Aug 6, 2024 Updated last year
- PyTorch code accompanying the paper "Imitating Graph-Based Planning with Goal-Conditioned Policies" (ICLR 2023). ☆21 Mar 4, 2023 Updated 3 years ago
- [ICLR'24] Efficient Video Diffusion Models via Content-Frame Motion-Latent Decomposition ☆54 May 14, 2024 Updated last year
- Subtask-Aware Visual Reward Learning from Segmented Demonstrations (ICLR 2025 accepted) ☆19 Apr 11, 2025 Updated last year
- Official implementation of paper ReTaKe: Reducing Temporal and Knowledge Redundancy for Long Video Understanding ☆40 Mar 16, 2025 Updated last year
- Modality-Agnostic Self-Supervised Learning with Meta-Learned Masked Auto-Encoder (NeurIPS 2023) ☆10 Jun 5, 2024 Updated last year
- RE3: State Entropy Maximization with Random Encoders for Efficient Exploration ☆69 Jul 29, 2021 Updated 4 years ago
- (ICLR 2025) Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation ☆16 Apr 29, 2025 Updated last year
- Official PyTorch Implementation for Vision-Language Models Create Cross-Modal Task Representations, ICML 2025 ☆33 May 1, 2025 Updated last year
- ☆61 Apr 16, 2023 Updated 3 years ago
- [CVPR2025] Controllable Human Image Generation with Personalized Multi-Garments ☆58 Mar 23, 2025 Updated last year
- The code for "VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by VIdeo SpatioTemporal Augmentation" [CVPR2025] ☆21 Feb 27, 2025 Updated last year
- ☆24 Dec 4, 2020 Updated 5 years ago
- TARS: MinMax Token-Adaptive Preference Strategy for Hallucination Reduction in MLLMs ☆24 Sep 21, 2025 Updated 7 months ago
- A Video Tokenizer Evaluation Dataset ☆156 Jan 13, 2025 Updated last year
- Diffusion Powers Video Tokenizer for Comprehension and Generation (CVPR 2025) ☆86 Feb 27, 2025 Updated last year
- Implementation of SmoothCache, a project aimed at speeding up Diffusion Transformer (DiT) based GenAI models with error-guided caching. ☆48 Jul 17, 2025 Updated 9 months ago
- Learning Large-scale Neural Fields via Context Pruned Meta-Learning (NeurIPS 2023) ☆28 Sep 24, 2023 Updated 2 years ago
- Adapting LLaMA Decoder to Vision Transformer ☆30 May 20, 2024 Updated last year
- [ICCV 2025] Official code for Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation ☆60 Sep 12, 2025 Updated 7 months ago
- FDFO: Finite Difference Flow Optimization ☆97 Apr 27, 2026 Updated last week
- ☆14 Apr 25, 2025 Updated last year
- [EMNLP 2024] Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality ☆22 Oct 8, 2024 Updated last year
- [TACL/EMNLP'24] Do Vision and Language Models Share Concepts? A Vector Space Alignment Study ☆16 Nov 22, 2024 Updated last year
- CatMAE ☆15 Dec 13, 2023 Updated 2 years ago