huiwon-jang / CoordTok
☆38 · Feb 6, 2025 · Updated last year
Alternatives and similar repositories for CoordTok
Users who are interested in CoordTok are comparing it to the repositories listed below
- Visual Representation Learning with Stochastic Frame Prediction (ICML 2024) ☆26 · Nov 27, 2024 · Updated last year
- TARS: MinMax Token-Adaptive Preference Strategy for Hallucination Reduction in MLLMs ☆23 · Sep 21, 2025 · Updated 4 months ago
- ElasticTok: Adaptive Tokenization for Image and Video ☆88 · Nov 4, 2024 · Updated last year
- ☆14 · Apr 25, 2025 · Updated 9 months ago
- GreenAug: Green Screen Augmentation Enables Scene Generalisation in Robotic Manipulation ☆12 · Sep 10, 2024 · Updated last year
- Subtask-Aware Visual Reward Learning from Segmented Demonstrations (ICLR 2025 accepted) ☆18 · Apr 11, 2025 · Updated 10 months ago
- Official PyTorch Implementation for Vision-Language Models Create Cross-Modal Task Representations, ICML 2025 ☆33 · May 1, 2025 · Updated 9 months ago
- Adapting LLaMA Decoder to Vision Transformer ☆30 · May 20, 2024 · Updated last year
- Jaehyung Kim et al.'s ACL 2023 paper on "infoVerse: A Universal Framework for Dataset Characterization with Multidimensional Meta-informat… ☆16 · Jun 28, 2023 · Updated 2 years ago
- Efficiently send large arrays across machines ☆17 · Jul 24, 2024 · Updated last year
- ☆24 · May 23, 2025 · Updated 8 months ago
- [TACL] Do Vision and Language Models Share Concepts? A Vector Space Alignment Study ☆16 · Nov 22, 2024 · Updated last year
- Official PyTorch implementation for LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior (ICLR 2025 Oral). ☆98 · Feb 11, 2025 · Updated last year
- ☆41 · Jun 9, 2025 · Updated 8 months ago
- [EMNLP 2024] Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality ☆21 · Oct 8, 2024 · Updated last year
- ☆18 · Jun 8, 2023 · Updated 2 years ago
- ☆19 · Jun 29, 2025 · Updated 7 months ago
- ☆16 · Apr 30, 2024 · Updated last year
- Official implementation of paper ReTaKe: Reducing Temporal and Knowledge Redundancy for Long Video Understanding ☆39 · Mar 16, 2025 · Updated 10 months ago
- A Recipe for Building LLM Reasoners to Solve Complex Instructions ☆29 · Oct 9, 2025 · Updated 4 months ago
- [ICLR 2026] Official PyTorch implementation for "ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding" ☆55 · Dec 26, 2025 · Updated last month
- CVPR 2022: Learning from Untrimmed Videos: Self-Supervised Video Representation Learning with Hierarchical Consistency ☆18 · Aug 10, 2022 · Updated 3 years ago
- [RA-L 2024] Novel action spaces leveraging redundancy in 7 DoF arms enable efficient & precise learning in robotic manipulation ☆21 · Jun 6, 2024 · Updated last year
- Implementation of SmoothCache, a project aimed at speeding up Diffusion Transformer (DiT) based GenAI models with error-guided caching. ☆47 · Jul 17, 2025 · Updated 6 months ago
- Adapting Self-Supervised Representations as a Latent Space for Efficient Generation ☆39 · Oct 17, 2025 · Updated 3 months ago
- [ICCV 2025] Official code for Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation ☆57 · Sep 12, 2025 · Updated 5 months ago
- ☆18 · May 30, 2023 · Updated 2 years ago
- The code for "VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by VIdeo SpatioTemporal Augmentation" [CVPR 2025] ☆21 · Feb 27, 2025 · Updated 11 months ago
- Diffusion Powers Video Tokenizer for Comprehension and Generation (CVPR 2025) ☆86 · Feb 27, 2025 · Updated 11 months ago
- Official implementation of ECCV24 paper: POA ☆24 · Aug 8, 2024 · Updated last year
- Official Repository of LatentSeek ☆76 · Jun 6, 2025 · Updated 8 months ago
- PyTorch code accompanying the paper "Imitating Graph-Based Planning with Goal-Conditioned Policies" (ICLR 2023). ☆20 · Mar 4, 2023 · Updated 2 years ago
- [ICML 2025] Official Implementation of Hierarchical Masked Autoregressive Models with Low-Resolution Token Pivots ☆30 · May 28, 2025 · Updated 8 months ago
- Coarse-to-fine Q-Network ☆58 · Aug 6, 2024 · Updated last year
- An Open-source Factuality Evaluation Demo for LLMs ☆32 · Aug 10, 2025 · Updated 6 months ago
- ☆45 · May 27, 2025 · Updated 8 months ago
- MuMA-ToM: Multi-modal Multi-Agent Theory of Mind ☆38 · Jan 23, 2025 · Updated last year
- ☆60 · Apr 16, 2023 · Updated 2 years ago
- A Text2SQL benchmark for evaluation of Large Language Models ☆41 · Feb 8, 2026 · Updated last week