zhaoyue-zephyrus/AVION

[arXiv:2309.16669] Code release for "Training a Large Video Model on a Single Machine in a Day"

☆138

Alternatives and similar repositories for AVION

Users who are interested in AVION are comparing it to the repositories listed below.

facebookresearch / LaViLa
Code release for "Learning Video Representations from Large Language Models"
☆534 · Oct 1, 2023 · Updated 2 years ago
md-mohaiminul / ViS4mer
☆58 · Dec 2, 2025 · Updated 7 months ago
leexinhao / ZeroI2V
[ECCV 2024] ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video
☆20 · Jul 29, 2024 · Updated 2 years ago
zhaoyue-zephyrus / TeSTra
Code for ECCV 2022 "Real-time Online Video Detection with Temporal Smoothing Transformers"
☆119 · Aug 23, 2025 · Updated 11 months ago
showlab / EgoVLP
[NeurIPS 2022] Egocentric Video-Language Pretraining
☆261 · May 9, 2024 · Updated 2 years ago
OpenGVLab / unmasked_teacher
[ICCV 2023 Oral] Unmasked Teacher: Towards Training-Efficient Video Foundation Models
☆348 · May 27, 2024 · Updated 2 years ago
HJYao00 / Side4Video
☆42 · Apr 7, 2024 · Updated 2 years ago
OpenGVLab / EgoVideo
[CVPR 2024 Champions][ICLR 2025] Solutions for EgoVis Challenges in CVPR 2024
☆136 · May 11, 2025 · Updated last year
facebookresearch / viewseg
Code for "Recognizing Scenes from Novel Viewpoints"
☆29 · Sep 16, 2022 · Updated 3 years ago
tiangeluo / ShapeCompiler
A Unified Framework for Transforming between Text, Point Cloud, and Program
☆19 · Jul 3, 2025 · Updated last year
Chuhanxx / helping_hand_for_egocentric_videos
Implementation of the paper "Helping Hands: An Object-Aware Ego-Centric Video Recognition Model"
☆33 · Nov 7, 2023 · Updated 2 years ago
alibaba-mmai-research / DiST
[ICCV 2023] Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning
☆41 · Sep 25, 2023 · Updated 2 years ago
OpenGVLab / efficient-video-recognition
☆184 · Aug 20, 2022 · Updated 3 years ago
facebookresearch / MeMViT
Code release for "MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Long-Term Video Recognition", CVPR 2022
☆155 · Nov 30, 2022 · Updated 3 years ago
epic-kitchens / epic-kitchens-slowfast
☆36 · Mar 22, 2022 · Updated 4 years ago
epic-kitchens / epic-kitchens-download-scripts
Download scripts for EPIC-KITCHENS
☆173 · Jun 10, 2026 · Updated last month
zihuixue / seeAoT
Code and data release for the paper "Seeing the Arrow of Time in Large Multimodal Models"
☆16 · Oct 2, 2025 · Updated 9 months ago
AndongDeng / BEAR
BEAR: a new BEnchmark on video Action Recognition
☆46 · Apr 21, 2024 · Updated 2 years ago
klauscc / TALLFormer
☆53 · Jan 3, 2023 · Updated 3 years ago
Richard-61 / FineAction
The official codebase of the FineAction dataset. We will update the data and code of FineAction.
☆24 · Apr 10, 2025 · Updated last year
UT-Austin-RPL / FORGE
Code for "Few-View Object Reconstruction with Unknown Categories and Camera Poses" at 3DV 2024 (oral)
☆93 · Jan 23, 2024 · Updated 2 years ago
ilkerkesen / ViLMA
ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models (ICLR 2024, Official Implementation)
☆16 · Jan 18, 2024 · Updated 2 years ago
XueFuzhao / HowToRunScenic
☆14 · Nov 28, 2022 · Updated 3 years ago
jayleicn / singularity
[ACL 2023] Official PyTorch code for Singularity model in "Revealing Single Frame Bias for Video-and-Language Learning"
☆136 · May 5, 2023 · Updated 3 years ago
showlab / GEB-Plus
[ECCV 2022] GEB+: A Benchmark for Generic Event Boundary Captioning, Grounding and Retrieval
☆17 · Aug 24, 2022 · Updated 3 years ago
OpenGVLab / VideoMAEv2
[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
☆806 · Oct 8, 2024 · Updated last year
WangFei-2019 / SNARE
Project for SNARE benchmark
☆11 · Jun 5, 2024 · Updated 2 years ago
fmthoker / SEVERE-BENCHMARK
☆26 · Aug 31, 2023 · Updated 2 years ago
Hritikbansal / videocon
☆58 · Apr 24, 2024 · Updated 2 years ago
showlab / MovieSeq
[ECCV 2024] Learning Video Context as Interleaved Multimodal Sequences
☆46 · Mar 11, 2025 · Updated last year
Tanveer81 / RGNet
Official implementation of RGNet: A Unified Retrieval and Grounding Network for Long Videos
☆20 · Mar 3, 2025 · Updated last year
sauradip / DiffusionTAD
[ICCV 2023] Official PyTorch implementation of the paper "DiffTAD: Temporal Action Detection with Proposal Denoising Diffusion"
☆37 · Mar 30, 2023 · Updated 3 years ago
klauscc / VindLU
☆108 · Dec 23, 2022 · Updated 3 years ago
fansunqi / AKeyS
Agentic Keyframe Search for Video Question Answering
☆18 · Jun 30, 2026 · Updated last month
amitakamath / vl_text_encoders_are_bottlenecks
Code and datasets for "Text encoders are performance bottlenecks in contrastive vision-language models". Coming soon!
☆11 · May 24, 2023 · Updated 3 years ago
jozhang97 / DETA
Detection Transformers with Assignment
☆270 · Sep 16, 2023 · Updated 2 years ago
UT-Austin-RPL / Doduo
Official PyTorch implementation of Doduo: Dense Visual Correspondence from Unsupervised Semantic-Aware Flow
☆44 · Feb 1, 2024 · Updated 2 years ago
PeisenZhao / Bottom-Up-TAL-with-MR
Implementation for Bottom-Up Temporal Action Localization with Mutual Regularization (ECCV 2020)
☆46 · Dec 2, 2020 · Updated 5 years ago
epic-kitchens / C5-Multi-Instance-Retrieval
☆11 · Feb 9, 2026 · Updated 5 months ago