g-luo / vlm_cross_modal_reps
Official PyTorch Implementation for Vision-Language Models Create Cross-Modal Task Representations, ICML 2025
☆33 · May 1, 2025 · Updated 9 months ago
Alternatives and similar repositories for vlm_cross_modal_reps
Users who are interested in vlm_cross_modal_reps are comparing it to the repositories listed below.
- [ICLR 2025] Weighted-Reward Preference Optimization for Implicit Model Fusion ☆14 · Mar 17, 2025 · Updated 11 months ago
- A Framework for Evaluating AI Agent Safety in Realistic Environments ☆30 · Oct 2, 2025 · Updated 4 months ago
- Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment ☆64 · Jul 22, 2025 · Updated 6 months ago
- [ACL 2025] Analyzing LLMs' Multilingual Knowledge Boundary Cognition Across Languages Through the Lens of Internal Representations ☆16 · Oct 18, 2025 · Updated 3 months ago
- [ICML'25] "Rethinking Addressing in Language Models via Contextualized Equivariant Positional Encoding" by Jiajun Zhu, Peihao Wang, Ruisi… ☆14 · Jun 6, 2025 · Updated 8 months ago
- LMM for VQA, tcsvt version ☆11 · Jul 19, 2024 · Updated last year
- ☆14 · Apr 25, 2025 · Updated 9 months ago
- ☆59 · Mar 3, 2025 · Updated 11 months ago
- PyTorch implementation of "Sample- and Parameter-Efficient Auto-Regressive Image Models" from CVPR 2025 ☆14 · Nov 21, 2025 · Updated 2 months ago
- ☆21 · Jul 25, 2025 · Updated 6 months ago
- ☆24 · May 23, 2025 · Updated 8 months ago
- CS194-196 Course Project ☆14 · Feb 20, 2025 · Updated 11 months ago
- [TACL] Do Vision and Language Models Share Concepts? A Vector Space Alignment Study ☆16 · Nov 22, 2024 · Updated last year
- Official implementation of paper "VMoBA: Mixture-of-Block Attention for Video Diffusion Models" ☆62 · Jul 1, 2025 · Updated 7 months ago
- ☆19 · Jun 29, 2025 · Updated 7 months ago
- [ACL 2025 Findings] Text2World: Benchmarking Large Language Models for Symbolic World Model Generation ☆27 · Feb 25, 2025 · Updated 11 months ago
- ☆38 · Feb 6, 2025 · Updated last year
- ☆33 · Jul 9, 2025 · Updated 7 months ago
- Code for reproducing our paper "Low Rank Adapting Models for Sparse Autoencoder Features" ☆17 · Mar 31, 2025 · Updated 10 months ago
- A Recipe for Building LLM Reasoners to Solve Complex Instructions ☆29 · Oct 9, 2025 · Updated 4 months ago
- [ICLR 2026] Official PyTorch implementation for "ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding" ☆55 · Dec 26, 2025 · Updated last month
- Code for the paper "Cottention: Linear Transformers With Cosine Attention" ☆20 · Nov 15, 2025 · Updated 3 months ago
- ☆16 · Jul 23, 2024 · Updated last year
- [EMNLP 2024] Tree of Problems: Improving structured problem solving with compositionality ☆19 · Mar 4, 2025 · Updated 11 months ago
- ABC: Achieving Better Control of Multimodal Embeddings using VLMs [TMLR2025] ☆20 · Aug 21, 2025 · Updated 5 months ago
- Code for "Preference Tuning For Toxicity Mitigation Generalizes Across Languages." Paper accepted at Findings of EMNLP 2024 ☆18 · Mar 25, 2025 · Updated 10 months ago
- [ICLR'25] Code for KaSA, an official implementation of "KaSA: Knowledge-Aware Singular-Value Adaptation of Large Language Models" ☆20 · Jan 16, 2025 · Updated last year
- Official repository of DialSim ☆28 · Oct 31, 2025 · Updated 3 months ago
- Experiments to assess SPADE on different LLM pipelines. ☆17 · Apr 7, 2024 · Updated last year
- ☆28 · Apr 22, 2025 · Updated 9 months ago
- ☆25 · Dec 13, 2024 · Updated last year
- [ACL 2025 Findings] Official implementation of the paper "Unveiling the Key Factors for Distilling Chain-of-Thought Reasoning". ☆20 · Feb 26, 2025 · Updated 11 months ago
- The code for "VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by VIdeo SpatioTemporal Augmentation" [CVPR2025] ☆21 · Feb 27, 2025 · Updated 11 months ago
- A benchmark dataset and simple code examples for measuring the perception and reasoning of multi-sensor Vision Language models. ☆19 · Dec 27, 2024 · Updated last year
- PhysGame Benchmark for Physical Commonsense Evaluation in Gameplay Videos ☆48 · Jul 3, 2025 · Updated 7 months ago
- Generative AI for Character Animation: A Comprehensive Survey of Techniques, Applications, and Future Directions ☆62 · May 13, 2025 · Updated 9 months ago
- ☆20 · Aug 30, 2025 · Updated 5 months ago
- Multimodal Music Generation with Explicit Bridges and Retrieval Augmentation: A framework for generating multimodal music by bridging dif… ☆28 · Jan 21, 2025 · Updated last year
- [ICML 2025] Official code of "AlphaDPO: Adaptive Reward Margin for Direct Preference Optimization" ☆29 · Jan 10, 2026 · Updated last month