facebookresearch / ego4d-goalstep

Ego4D Goal-Step: Toward Hierarchical Understanding of Procedural Activities (NeurIPS 2023)

☆44

Alternatives and similar repositories for ego4d-goalstep

Users who are interested in ego4d-goalstep are comparing it to the repositories listed below.

facebookresearch / EgoVLPv2
Code release for "EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone" [ICCV, 2023]
☆99 Updated last year
lbaermann / qaego4d
Code and Dataset for the CVPRW Paper "Where did I leave my keys? — Episodic-Memory-Based Question Answering on Egocentric Videos"
☆27 Updated last year
egoschema / EgoSchema
☆94 Updated 7 months ago
OpenGVLab / EgoVideo
[CVPR 2024 Champions][ICLR 2025] Solutions for EgoVis Challenges in CVPR 2024
☆127 Updated 2 months ago
CeeZh / LLoVi
Official implementation for "A Simple LLM Framework for Long-Range Video Question-Answering"
☆100 Updated 9 months ago
facebookresearch / htstep
HT-Step is a large-scale article grounding dataset of temporal step annotations on how-to videos
☆20 Updated last year
salesforce / paprika
Code for CVPR 2023 paper "Procedure-Aware Pretraining for Instructional Video Understanding"
☆50 Updated 6 months ago
OpenGVLab / EgoExoLearn
[CVPR 2024] Data and benchmark code for the EgoExoLearn dataset
☆65 Updated 11 months ago
PolyU-ChenLab / ETBench
👾 E.T. Bench: Towards Open-Ended Event-Level Video-Language Understanding (NeurIPS 2024)
☆60 Updated 6 months ago
alanaai / EVUD
Egocentric Video Understanding Dataset (EVUD)
☆30 Updated last year
facebookresearch / ProcedureVRL
[CVPR 2023] Official code for "Learning Procedure-aware Video Representation from Instructional Videos and Their Narrations"
☆54 Updated last year
Becomebright / GroundVQA
Official PyTorch code of GroundVQA (CVPR'24)
☆61 Updated 10 months ago
DCDmllm / Momentor
☆76 Updated 8 months ago
zhaoyue-zephyrus / AVION
[arXiv:2309.16669] Code release for "Training a Large Video Model on a Single Machine in a Day"
☆133 Updated last year
StanfordVL / atp-video-language
Official repo for CVPR 2022 (Oral) paper: Revisiting the "Video" in Video-Language Understanding. Contains code for the Atemporal Probe (…
☆51 Updated last year
Ahnsun / merlin
[ECCV2024] Official code implementation of Merlin: Empowering Multimodal LLMs with Foresight Minds
☆94 Updated last year
showlab / EgoVLP
[NeurIPS 2022] Egocentric Video-Language Pretraining
☆243 Updated last year
qirui-chen / MultiHop-EgoQA
[AAAI 2025] Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos
☆26 Updated 2 months ago
mu-cai / TemporalBench
TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models
☆33 Updated 8 months ago
Yui010206 / SeViLA
[NeurIPS 2023] Self-Chained Image-Language Model for Video Localization and Question Answering
☆187 Updated last year
MikeWangWZHL / Paxion
Repo for paper: "Paxion: Patching Action Knowledge in Video-Language Foundation Models" (NeurIPS 2023 Spotlight)
☆37 Updated 2 years ago
houzhijian / GroundNLQ
The champion solution for Ego4D Natural Language Queries Challenge in CVPR 2023
☆17 Updated last year
yellow-binary-tree / HawkEye
Official implementation of HawkEye: Training Video-Text LLMs for Grounding Text in Videos
☆42 Updated last year
j-min / HiREST
Hierarchical Video-Moment Retrieval and Step-Captioning (CVPR 2023)
☆102 Updated 6 months ago
imagegridworth / IG-VLM
☆138 Updated 10 months ago
antoyang / FrozenBiLM
[NeurIPS 2022] Zero-Shot Video Question Answering via Frozen Bidirectional Language Models
☆157 Updated 7 months ago
Chuhanxx / helping_hand_for_egocentric_videos
Implementation of paper 'Helping Hands: An Object-Aware Ego-Centric Video Recognition Model'
☆33 Updated last year
rxtan2 / Koala-video-llm
☆35 Updated 10 months ago
kkahatapitiya / LangRepo
Language Repository for Long Video Understanding
☆32 Updated last year
showlab / cosmo
☆72 Updated last year