ByZ0e/Glance-Focus

This repo contains the source code for "Glance and Focus: Memory Prompting for Multi-Event Video Question Answering" (accepted at NeurIPS 2023).

☆31

Alternatives and similar repositories for Glance-Focus

Users that are interested in Glance-Focus are comparing it to the libraries listed below.

showlab / mist
☆37 · Dec 20, 2023 · Updated 2 years ago
doc-doc / CoVGT
Contrastive Video Question Answering via Video Graph Transformer (IEEE T-PAMI'23)
☆20 · Mar 9, 2024 · Updated 2 years ago
Ziyang412 / UCoFiA
PyTorch Code for "Unified Coarse-to-Fine Alignment for Video-Text Retrieval" (ICCV 2023)
☆66 · Jun 7, 2024 · Updated 2 years ago
bladewaltz1 / PromptSwitch
☆30 · Aug 14, 2023 · Updated 2 years ago
engindeniz / vitis
[ICCV 2023 CLVL Workshop] Zero-Shot and Few-Shot Video Question Answering with Multi-Modal Prompts
☆13 · Jan 13, 2025 · Updated last year
yl3800 / TranSTR
☆12 · Dec 15, 2023 · Updated 2 years ago
aurooj / SHG-VQA
Learning Situation Hyper-Graphs for Video Question Answering
☆23 · Feb 16, 2024 · Updated 2 years ago
showlab / GEB-Plus
[ECCV 2022] GEB+: A Benchmark for Generic Event Boundary Captioning, Grounding and Retrieval
☆17 · Aug 24, 2022 · Updated 3 years ago
AmeenAli / VideoMatch
☆14 · Jan 5, 2022 · Updated 4 years ago
huangmozhi9527 / GMMFormer
[AAAI 2024] GMMFormer: Gaussian-Mixture-Model Based Transformer for Efficient Partially Relevant Video Retrieval
☆21 · May 10, 2024 · Updated 2 years ago
yl3800 / IGV
This repo contains code for Invariant Grounding for Video Question Answering
☆27 · Mar 2, 2023 · Updated 3 years ago
sqiangcao99 / E2E-LOAD
☆21 · Jul 26, 2023 · Updated 3 years ago
doc-doc / HQGA
Video as Conditional Graph Hierarchy for Multi-Granular Question Answering (AAAI'22, Oral)
☆35 · Sep 17, 2022 · Updated 3 years ago
WHB139426 / GCG
Weakly Supervised Gaussian Contrastive Grounding with Large Multimodal Models for Video Question Answering [ACM MM'24]
☆10 · Jul 22, 2024 · Updated 2 years ago
ninatu / in_style
Official implementation of "In-style: Bridging Text and Uncurated Videos with Style Transfer for Cross-modal Retrieval." ICCV 2023
☆11 · Oct 5, 2023 · Updated 2 years ago
mlvlab / Flipped-VQA
Large Language Models are Temporal and Causal Reasoners for Video Question Answering (EMNLP 2023)
☆77 · Mar 26, 2025 · Updated last year
mlvlab / OVQA
Open-Vocabulary Video Question Answering: A New Benchmark for Evaluating the Generalizability of Video Question Answering Models (ICCV 20…
☆18 · Apr 23, 2024 · Updated 2 years ago
csbobby / STAR_Benchmark
☆36 · Apr 18, 2024 · Updated 2 years ago
cnzeki / margin-centre-face
Face recognition
☆11 · Jun 20, 2019 · Updated 7 years ago
MGitHubL / TMac
☆14 · Feb 26, 2024 · Updated 2 years ago
GeWu-Lab / Generalizable-Audio-Visual-Segmentation
Official repository of "Prompting Segmentation with Sound is Generalizable Audio-Visual Source Localizer", AAAI 2024
☆28 · Mar 14, 2026 · Updated 4 months ago
madeleinegrunde / AGQA_baselines_code
☆18 · Nov 1, 2023 · Updated 2 years ago
intel / TVP
☆15 · Aug 4, 2025 · Updated 11 months ago
GeWu-Lab / Ref-AVS
The official repo for "Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes", ECCV 2024
☆50 · Oct 12, 2025 · Updated 9 months ago
Hokhim2 / CVBench
☆19 · Aug 28, 2025 · Updated 11 months ago
bofang98 / UATVR
[ICCV'23] UATVR: Uncertainty-Adaptive Text-Video Retrieval
☆13 · Nov 5, 2023 · Updated 2 years ago
minghangz / SPL
Generating Structured Pseudo Labels for Noise-resistant Zero-shot Video Sentence Localization
☆16 · Jul 20, 2023 · Updated 3 years ago
zhangbw17 / MV-Adapter
An official PyTorch implementation of the paper: [MV-Adapter: Multimodal Video Transfer Learning for Video Text Retrieval].
☆14 · Jul 27, 2024 · Updated 2 years ago
wangpengnorman / KB-Ref_dataset
☆16 · Dec 28, 2020 · Updated 5 years ago
bigai-nlco / VideoTGB
[EMNLP 2024] A Video Chat Agent with Temporal Prior
☆33 · Mar 2, 2025 · Updated last year
VRU-NExT / VideoQA
☆104 · Oct 19, 2022 · Updated 3 years ago
antoyang / FrozenBiLM
[NeurIPS 2022] Zero-Shot Video Question Answering via Frozen Bidirectional Language Models
☆159 · Dec 9, 2024 · Updated last year
kj3moraes / movieclip
An experiment with movie scenes and contrastive learning
☆11 · Feb 1, 2025 · Updated last year
Huntersxsx / MGPN
Source code of our MGPN in SIGIR 2022
☆18 · Jun 8, 2022 · Updated 4 years ago
doc-doc / NExT-GQA
Can I Trust Your Answer? Visually Grounded Video Question Answering (CVPR'24, Highlight)
☆89 · Jul 1, 2024 · Updated 2 years ago
QZ1-boy / CPGA
[CVPR2024] Dataset and Code of "CPGA: Coding Priors-Guided Aggregation Network for Compressed Video Quality Enhancement".
☆14 · Dec 14, 2024 · Updated last year
knightyxp / DGL
[AAAI 2024] DGL: Dynamic Global-Local Prompt Tuning for Text-Video Retrieval.
☆49 · Oct 14, 2024 · Updated last year
sayaknag / unbiasedSGG
Official PyTorch Implementation of the framework TEMPURA proposed in our paper Unbiased Scene Graph Generation in Videos accepted by CVPR…
☆25 · Sep 9, 2025 · Updated 10 months ago
reddyav1 / RoCoG-v2
RoCoG-v2 (Robot Control Gestures) is a dataset intended to support the study of synthetic-to-real and ground-to-air video domain adaptati…
☆17 · Mar 28, 2024 · Updated 2 years ago