Video + CLIP Baseline for Ego4D Long Term Action Anticipation Challenge (CVPR 2022)
☆15Jul 4, 2022Updated 3 years ago
Alternatives and similar repositories for clip_baseline_LTA_Ego4d
Users that are interested in clip_baseline_LTA_Ego4d are comparing it to the libraries listed below.
- Code for the paper Seeing the Pose in the Pixels: Learning Pose-Aware Representations in Vision Transformers☆21Aug 2, 2024Updated last year
- Code for NeurIPS 2022 paper "Learning Viewpoint-Agnostic Visual Representations by Recovering Tokens in 3D Space"☆20Apr 20, 2023Updated 3 years ago
- This is the official repository of LLAVIDAL☆24Oct 4, 2025Updated 7 months ago
- Team Doggeee's solution to the Ego4D LTA challenge @ CVPRW'23☆14Nov 4, 2023Updated 2 years ago
- This code is provided for reproducibility of results in the paper: Multiview Aerial Visual Recognition (MAVREC): Can Multi-view Improve A…☆23Feb 6, 2025Updated last year
- WACV 2024: "PathLDM: Text conditioned Latent Diffusion Model for Histopathology"☆49Jul 7, 2024Updated last year
- ☆18Dec 17, 2022Updated 3 years ago
- Implementation of paper 'Helping Hands: An Object-Aware Ego-Centric Video Recognition Model'☆33Nov 7, 2023Updated 2 years ago
- [WIP] Code for LangToMo☆21Mar 19, 2026Updated last month
- [WACV 2024] Code for "Limited Data, Unlimited Potential: A Study on ViTs Augmented by Masked Autoencoders"☆25Aug 16, 2024Updated last year
- Code for the paper: F. Ragusa, G. M. Farinella, A. Furnari. StillFast: An End-to-End Approach for Short-Term Object Interaction Anticipat…☆13Apr 11, 2023Updated 3 years ago
- Tools for Toyota Smarthome datasets☆14Nov 16, 2022Updated 3 years ago
- ☆19Sep 10, 2021Updated 4 years ago
- [CVPR 2022] Joint hand motion and interaction hotspots prediction from egocentric videos☆72Jan 29, 2024Updated 2 years ago
- An unofficial pytorch dataloader for Open X-Embodiment Datasets https://github.com/google-deepmind/open_x_embodiment☆25Jan 9, 2025Updated last year
- Affordance Grounding from Demonstration Video to Target Image (CVPR 2023)☆46Jul 26, 2024Updated last year
- Simple PyTorch Dataset for the EPIC-Kitchens-55 and EPIC-Kitchens-100 that handles frames and features (rgb, optical flow, and objects) f…☆24Jan 22, 2023Updated 3 years ago
- Environments for Active Vision Reinforcement Learning☆29Oct 10, 2024Updated last year
- Ego4d dataset repository. Download the dataset, visualize, extract features & example usage of the dataset☆588Apr 15, 2026Updated 3 weeks ago
- Pose driven attention mechanism☆44Mar 31, 2022Updated 4 years ago
- Repository for an extension of VPN for recognition of Activities of Daily Living☆16May 17, 2021Updated 4 years ago
- Code for the paper: Anticipative Feature Fusion Transformer for Multi-Modal Action Anticipation.☆32Aug 15, 2023Updated 2 years ago
- ☆14Jun 25, 2022Updated 3 years ago
- Official code for "Disentangling Visual Embeddings for Attributes and Objects" Published at CVPR 2022☆34Aug 4, 2023Updated 2 years ago
- Official code implementation of the paper AntGPT: Can Large Language Models Help Long-term Action Anticipation from Videos?☆30Sep 23, 2024Updated last year
- [NeurIPS 2022] Egocentric Video-Language Pretraining☆260May 9, 2024Updated 2 years ago
- ☆13Jul 6, 2022Updated 3 years ago
- [AAAI 2025] Official Repository of 'SKI Models: Skeleton Induced Vision-Language Embeddings for Understanding Activities of Daily Living'☆23Sep 17, 2025Updated 7 months ago
- Code for our ACL 2025 paper "Language Repository for Long Video Understanding"☆36Jun 17, 2024Updated last year
- dist☆10Dec 14, 2018Updated 7 years ago
- PyTorch implementation of SimCLR: A Simple Framework for Contrastive Learning of Visual Representations☆17Apr 25, 2020Updated 6 years ago
- ☆24Mar 24, 2023Updated 3 years ago
- [CVPR 2026] Official Repository of 'MS-Temba: Multi-Scale Temporal Mamba for Understanding Long Untrimmed Videos'☆44Jan 23, 2026Updated 3 months ago
- [ACM MM 2024] Frequency Guidance Matters: Skeletal Action Recognition by Frequency-Aware Mixed Transformer☆21Apr 28, 2026Updated last week
- [ICLR'25] LLaRA: Supercharging Robot Learning Data for Vision-Language Policy☆229Mar 29, 2025Updated last year
- ☆10Oct 20, 2023Updated 2 years ago
- Code for CVPR2021 Paper “Cascaded Prediction Network via Segment Tree for Temporal Video Grounding”☆10Apr 3, 2022Updated 4 years ago
- This repository contains the Adverbs in Recipes (AIR) dataset and the code published at the CVPR 23 paper: "Learning Action Changes by Me…☆13May 25, 2023Updated 2 years ago
- Train I3D on NTU-RGB+D dataset in keras☆11Feb 5, 2019Updated 7 years ago