JiazuoYu / PathWeave
Code for the paper "LLMs Can Evolve Continually on Modality for X-Modal Reasoning" (NeurIPS 2024)
☆35 Updated 4 months ago
Alternatives and similar repositories for PathWeave
Users who are interested in PathWeave are comparing it to the repositories listed below.
- [CVPR2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory" ☆68 Updated 6 months ago
- Official PyTorch code of GroundVQA (CVPR'24) ☆60 Updated 8 months ago
- Can I Trust Your Answer? Visually Grounded Video Question Answering (CVPR'24, Highlight) ☆70 Updated 10 months ago
- ✨A curated list of papers on the uncertainty in multi-modal large language model (MLLM). ☆44 Updated last month
- 👾 E.T. Bench: Towards Open-Ended Event-Level Video-Language Understanding (NeurIPS 2024) ☆58 Updated 3 months ago
- [CVPR 2024] Retrieval-Augmented Image Captioning with External Visual-Name Memory for Open-World Comprehension ☆49 Updated last year
- The official repository for paper "PruneVid: Visual Token Pruning for Efficient Video Large Language Models". ☆38 Updated 2 months ago
- [NeurIPS 2023] The official implementation of SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation ☆31 Updated last year
- [BMVC 2023] Zero-shot Composed Text-Image Retrieval ☆54 Updated 5 months ago
- Official implementation of HawkEye: Training Video-Text LLMs for Grounding Text in Videos ☆41 Updated last year
- Code for the paper: "SuS-X: Training-Free Name-Only Transfer of Vision-Language Models" [ICCV'23] ☆99 Updated last year
- A lightweight codebase for referring expression comprehension and segmentation ☆54 Updated 2 years ago
- ☆71 Updated 5 months ago
- [NeurIPS 2024] Visual Perception by Large Language Model’s Weights ☆45 Updated last month
- ☆35 Updated last year
- [ICLR 2025] TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning ☆33 Updated last month
- [CVPR 2024] Context-Guided Spatio-Temporal Video Grounding ☆53 Updated 10 months ago
- ☆17 Updated 5 months ago
- (CVPR2024) MeaCap: Memory-Augmented Zero-shot Image Captioning ☆47 Updated 8 months ago
- [ICML2024] Repo for the paper "Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models" ☆20 Updated 4 months ago
- Envolving Temporal Reasoning Capability into LMMs via Temporal Consistent Reward ☆35 Updated last month
- [CVPRW-25 MMFM] Official repository of paper titled "How Good is my Video LMM? Complex Video Reasoning and Robustness Evaluation Suite fo… ☆47 Updated 8 months ago
- [CVPR 2025] Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention ☆33 Updated 9 months ago
- This repo holds the official code and data for "Beyond Literal Descriptions: Understanding and Locating Open-World Objects Aligned with H… ☆17 Updated 11 months ago
- ☆29 Updated 8 months ago
- ☆92 Updated last year
- Open-vocabulary Video Instance Segmentation Codebase built upon Detectron2, which is really easy to use. ☆23 Updated 4 months ago
- [CVPR 2024] Official PyTorch implementation of the paper "One For All: Video Conversation is Feasible Without Video Instruction Tuning" ☆32 Updated last year
- A comprehensive survey of Composed Multi-modal Retrieval (CMR), including Composed Image Retrieval (CIR) and Composed Video Retrieval (CV… ☆31 Updated 2 months ago
- PyTorch code for "Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training" ☆34 Updated last year