traveler-framework / TraveLER
[EMNLP 2024] TraveLER: A Modular Multi-LMM Agent Framework for Video Question-Answering
☆16 · Oct 31, 2024 · Updated last year
Alternatives and similar repositories for TraveLER
Users who are interested in TraveLER are comparing it to the libraries listed below:
- [CVPR 2025] Official PyTorch code of "Enhancing Video-LLM Reasoning via Agent-of-Thoughts Distillation". ☆54 · May 25, 2025 · Updated 8 months ago
- ☆24 · Oct 13, 2024 · Updated last year
- Learning Situation Hyper-Graphs for Video Question Answering ☆22 · Feb 16, 2024 · Updated last year
- [AAAI 2026] GenMAC for Compositional Text-to-Video Generation ☆32 · Jan 10, 2026 · Updated last month
- The official implementation of "Cross-modal Causal Relation Alignment for Video Question Grounding. (CVPR 2025 Highlight)" ☆42 · Apr 27, 2025 · Updated 9 months ago
- ☆18 · Jun 10, 2025 · Updated 8 months ago
- ☆83 · Jul 16, 2023 · Updated 2 years ago
- ☆13 · Feb 2, 2025 · Updated last year
- ☆13 · Aug 28, 2024 · Updated last year
- ☆11 · Aug 29, 2025 · Updated 5 months ago
- Compiler plugin for performance analysis of HIP applications ☆13 · Apr 7, 2025 · Updated 10 months ago
- TARS: MinMax Token-Adaptive Preference Strategy for Hallucination Reduction in MLLMs ☆23 · Sep 21, 2025 · Updated 4 months ago
- Bidirectional Likelihood Estimation with Multi-Modal Large Language Models for Text-Video Retrieval (ICCV 2025 Highlight) ☆20 · Aug 1, 2025 · Updated 6 months ago
- quagga ☆10 · Apr 7, 2020 · Updated 5 years ago
- Code for MME-SID accepted to CIKM 2025 Full Research track. ☆27 · Oct 29, 2025 · Updated 3 months ago
- ☆16 · Oct 9, 2024 · Updated last year
- Agentic Keyframe Search for Video Question Answering ☆15 · Apr 7, 2025 · Updated 10 months ago
- This repository implements computer vision for real-time chessboard detection and piece recognition. Using OpenCV and Numpy, the system p… ☆12 · Sep 24, 2024 · Updated last year
- ☆134 · Apr 16, 2025 · Updated 9 months ago
- ☆11 · Sep 1, 2020 · Updated 5 years ago
- Firmware for the EcoSteno stenographer keyboard ☆12 · Feb 17, 2023 · Updated 2 years ago
- [EMNLP 2024 Industry track] MERLIN : Multimodal Embedding Refinement via LLM-based Iterative Navigation for Text-Video Retrieval-Rerank P… ☆14 · Mar 4, 2025 · Updated 11 months ago
- Project for SNARE benchmark ☆11 · Jun 5, 2024 · Updated last year
- LVAS-Agent Code Base ☆22 · Apr 15, 2025 · Updated 10 months ago
- LLVM Plugin to Instrument Global Memory Accesses in CUDA Kernels ☆10 · Jun 8, 2020 · Updated 5 years ago
- Code for paper "W-RAG: Weakly Supervised Dense Retrieval in RAG for Open-domain Question Answering" ☆15 · Oct 2, 2025 · Updated 4 months ago
- ECCV2020_Spatial Hierarchy Aware Residual Pyramid Network for Time-of-Flight Depth Denoising ☆12 · Sep 24, 2020 · Updated 5 years ago
- [MICCAI' 22] Semi-Supervised Medical Image Classification with Temporal Knowledge-Aware Regularization ☆14 · Jun 27, 2022 · Updated 3 years ago
- ☆11 · Sep 16, 2021 · Updated 4 years ago
- [ICCV 2025] Object-centric Video Question Answering with Visual Grounding and Referring ☆24 · Aug 8, 2025 · Updated 6 months ago
- [ICLR2023] Video Scene Graph Generation from Single-Frame Weak Supervision ☆12 · Sep 17, 2023 · Updated 2 years ago
- SpaceVLLM: Endowing Multimodal Large Language Model with Spatio-Temporal Video Grounding Capability ☆16 · May 8, 2025 · Updated 9 months ago
- Lightweight Multi-Level Multi-Scale Feature Fusion Network for Semantic Segmentation ☆11 · May 31, 2021 · Updated 4 years ago
- ☆38 · Dec 19, 2025 · Updated last month
- Hands-On Tutorial on Building Multimodal RAG Systems ☆13 · Apr 10, 2025 · Updated 10 months ago
- [ICCV'23] UATVR: Uncertainty-Adaptive Text-Video Retrieval ☆13 · Nov 5, 2023 · Updated 2 years ago
- ☆12 · Jan 10, 2025 · Updated last year
- Kubernetes-native IoT gateway ☆14 · Jul 21, 2025 · Updated 6 months ago
- Record GPU memory accesses of a CUDA program and visualize the access pattern in a browser ☆13 · Nov 17, 2020 · Updated 5 years ago