VJWQ / AV-CONV

This is a third-party implementation of the paper "The Audio-Visual Conversational Graph: From an Egocentric-Exocentric Perspective".

☆9

Alternatives and similar repositories for AV-CONV

Users who are interested in AV-CONV are comparing it to the libraries listed below.

BolinLai / LEGO
[ECCV2024, Oral, Best Paper Finalist] This is the official implementation of the paper "LEGO: Learning EGOcentric Action Frame Generation …
☆37 Updated 3 months ago
IFICL / SLfM
Official code for the paper: [ICCV2023] Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation
☆39 Updated last year
luomingshuang / M3GPT
M3GPT: An advanced multimodal, multitask framework for motion comprehension and generation.
☆13 Updated 5 months ago
Holistic-Motion2D / Tender
[arXiv'24] Holistic-Motion2D: Scalable Whole-body Human Motion Generation in 2D Space
☆43 Updated 7 months ago
yumingj / GroupDiff
☆10 Updated 10 months ago
Sid2697 / HOI-Ref
Code implementation for paper titled "HOI-Ref: Hand-Object Interaction Referral in Egocentric Vision"
☆27 Updated last year
renwang435 / video-ttt-release
☆61 Updated last year
neu-vi / FleVRS
FleVRS: Towards Flexible Visual Relationship Segmentation, NeurIPS 2024
☆20 Updated 5 months ago
sanjayss34 / prosepose
☆33 Updated 3 weeks ago
jialuli-luka / Video-MSG
Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization
☆21 Updated last month
aspirinone / CATR.github.io
☆31 Updated last year
showlab / Exo2Ego-V
☆44 Updated last month
Jyxarthur / AutoAD-Zero
[ACCV 2024] Official Implementation of "AutoAD-Zero: A Training-Free Framework for Zero-Shot Audio Description". Junyu Xie, Tengda Han, M…
☆25 Updated 4 months ago
neu-vi / Diag-HOI
☆26 Updated last year
facebookresearch / EgoT2
Code release for the paper "Egocentric Video Task Translation" (CVPR 2023 Highlight)
☆32 Updated last year
facebookresearch / EgoObjects
[ICCV2023] EgoObjects: A Large-Scale Egocentric Dataset for Fine-Grained Object Understanding
☆76 Updated last year
facebookresearch / replay_dataset
Download scripts and tools for Replay dataset.
☆32 Updated last year
elicassion / 3DTRL
Code for NeurIPS 2022 paper "Learning Viewpoint-Agnostic Visual Representations by Recovering Tokens in 3D Space"
☆20 Updated 2 years ago
fanglaosi / Skeleton-in-Context
[CVPR2024] Official implementation of the paper: Skeleton-in-Context: Unified Skeleton Sequence Modeling with In-Context Learning
☆39 Updated last year
ethanhe42 / dds
DDS: Delta Denoising Score PyTorch implementation
☆19 Updated last year
PardoAlejo / MatchDiffusion
☆14 Updated 3 weeks ago
soCzech / GenHowTo
Code for the paper "GenHowTo: Learning to Generate Actions and State Transformations from Instructional Videos" published at CVPR 2024
☆51 Updated last year
EGO4D / social-interactions
☆51 Updated 2 years ago
NVlabs / QLIP
[arXiv: 2502.05178] QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation
☆74 Updated 3 months ago
OpenRobotLab / EgoHOD
Official implementation of EgoHOD at ICLR 2025
☆18 Updated 3 months ago
rese1f / PoseDA
[ICCV 2023] Global Adaptation meets Local Generalization: Unsupervised Domain Adaptation for 3D Human Pose Estimation
☆23 Updated last year
johannwyh / StyleInV
Official Implementation of ICCV 2023 paper "StyleInV: A Temporal Style Modulated Inversion Network for Unconditional Video Generation"
☆23 Updated last year
steve-zeyu-zhang / InfiniMotion
InfiniMotion: Mamba Boosts Memory in Transformer for Arbitrary Long Motion Generation
☆11 Updated 4 months ago
OpenGVLab / EgoExoLearn
[CVPR 2024] Data and benchmark code for the EgoExoLearn dataset
☆59 Updated 9 months ago
TencentARC / Divot
Diffusion Powers Video Tokenizer for Comprehension and Generation (CVPR 2025)
☆68 Updated 3 months ago