fanyix/SlowFast

PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.

☆14

Alternatives and similar repositories for SlowFast

Users who are interested in SlowFast are comparing it to the libraries listed below.

Yu-Wu / Modaily-Aware-Audio-Visual-Video-Parsing
Code for CVPR 2021 paper Exploring Heterogeneous Clues for Weakly-Supervised Audio-Visual Video Parsing
☆24 · Dec 29, 2021 · Updated 4 years ago
tair-opensource / AlibabaCloud.TairSDK
Based on StackExchange.Redis that operates Tair For Redis Modules.
☆11 · Feb 28, 2025 · Updated last year
XiaoYu-1123 / PreFM
[NeurIPS 2025] PreFM: Online Audio-Visual Event Parsing via Predictive Future Modeling
☆20 · Oct 26, 2025 · Updated 8 months ago
FrankFundel / SGCond
☆10 · Jun 28, 2023 · Updated 3 years ago
xiaoneil / LPNet
☆13 · Nov 28, 2021 · Updated 4 years ago
SongYii / awesome-weakly-supervised-object-detection
A paper list of Weakly Supervised Object Detection (WSOD) resources.
☆13 · May 6, 2021 · Updated 5 years ago
snap-research / VIMI
☆13 · Jul 10, 2024 · Updated 2 years ago
MotasemAlfarra / Online_Test_Time_Adaptation
Revisiting Test Time Adaptation Under Online Evaluation
☆36 · May 2, 2024 · Updated 2 years ago
ilpoviertola / V-AURA
The official implementation of V-AURA: Temporally Aligned Audio for Video with Autoregression (ICASSP 2025) (Oral)
☆35 · Feb 11, 2026 · Updated 5 months ago
showlab / DemoVLP
[Arxiv2022] Revitalize Region Feature for Democratizing Video-Language Pre-training
☆22 · Mar 19, 2022 · Updated 4 years ago
HumamAlwassel / XDC
Self-Supervised Learning by Cross-Modal Audio-Video Clustering (NeurIPS 2020)
☆91 · Oct 24, 2022 · Updated 3 years ago
jaeseokbyun / GRIT-VLP
This is an official implementation of GRIT-VLP
☆20 · Aug 8, 2022 · Updated 3 years ago
kahnchana / svt
Official repository for "Self-Supervised Video Transformer" (CVPR'22)
☆109 · Jun 26, 2024 · Updated 2 years ago
myscience / variational-diffusion
Unofficial implementation of Variational Diffusion Models in PyTorch (Lightning)
☆12 · Aug 31, 2023 · Updated 2 years ago
naver-ai / rewas
Official PyTorch implementation of ReWaS (AAAI'25) "Read, Watch and Scream! Sound Generation from Text and Video"
☆44 · Dec 13, 2024 · Updated last year
hdu-coder / evaluate-teacher
HDU - A small script for submitting teacher evaluations at the end of the term; it must be run in the console
☆14 · May 21, 2016 · Updated 10 years ago
shanwangshan / TAU-urban-audio-visual-scenes
☆12 · Oct 23, 2021 · Updated 4 years ago
Rowl1ng / Structure-Aware-VR-Sketch-Shape-Retrieval
☆19 · Aug 6, 2024 · Updated last year
yehonathanlitman / EditCtrl
[CVPR 2026] EditCtrl: Disentangled Local and Global Control for Real-Time Generative Video Editing
☆46 · Jul 13, 2026 · Updated last week
k51 / STGSP
☆11 · Jan 2, 2022 · Updated 4 years ago
fyyCS / LSLD
☆14 · Nov 13, 2023 · Updated 2 years ago
hmorimitsu / maskgit-torch
A MaskGIT port from JAX to PyTorch
☆18 · Jun 18, 2022 · Updated 4 years ago
virajprabhu / LANCE
LANCE: Stress-testing Visual Models by Generating Language-guided Counterfactual Images
☆31 · Nov 30, 2023 · Updated 2 years ago
jayelm / lsl
Shaping Visual Representations with Language for Few-shot Classification, ACL 2020
☆16 · May 9, 2021 · Updated 5 years ago
mawenbao / protobuf-demo
A simple demo project of cmake and google protocol buffer.
☆10 · Dec 3, 2013 · Updated 12 years ago
FrancescoSaverioZuppichini / yolov12
☆11 · Jan 29, 2023 · Updated 3 years ago
antoine77340 / MIL-NCE_HowTo100M
PyTorch GPU distributed training code for MIL-NCE HowTo100M
☆221 · Jul 5, 2022 · Updated 4 years ago
divyakkm / Data-Mining-Project
Analyzing Airline data to predict delays
☆19 · May 15, 2014 · Updated 12 years ago
Demfier / philo
Philo: uniting modalities. A repository with adaptive fusion techniques for multimodal data
☆26 · Mar 16, 2025 · Updated last year
laulampaul / text-animator
☆20 · Jun 26, 2024 · Updated 2 years ago
luli-git / MAP
MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation
☆18 · Sep 2, 2024 · Updated last year
ekazakos / auditory-slow-fast
Implementation of "Slow-Fast Auditory Streams for Audio Recognition, ICASSP, 2021" in PyTorch
☆73 · Sep 27, 2021 · Updated 4 years ago
kushalk7 / Gesture-recognition-using-CNNLSTM
Using GRIT dataset, built model combining 2D CNN to LSTM to perform real-time gesture recognition from webCam video feed. Built another m…
☆18 · Jun 2, 2018 · Updated 8 years ago
Yangyi-Chen / CoTConsistency
The released data for paper "Measuring and Improving Chain-of-Thought Reasoning in Vision-Language Models".
☆34 · Sep 16, 2023 · Updated 2 years ago
stoneMo / MGN
Official implementation for MGN
☆20 · Dec 22, 2022 · Updated 3 years ago
zhangguanghao523 / CMMCoT
[AAAI'26] Official implementation of CMMCoT: Enhancing Complex Multi-Image Comprehension via Multi-Modal Chain-of-Thought and Memory Augm…
☆11 · Dec 5, 2025 · Updated 7 months ago
hche11 / VGGSound
VGGSound: A Large-scale Audio-Visual Dataset
☆359 · Sep 13, 2021 · Updated 4 years ago
M1n9X / GraphRAG_Lite
☆16 · Jul 12, 2024 · Updated 2 years ago
X-PLUG / Youku-mPLUG
Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Pre-training Dataset and Benchmarks
☆307 · Jan 8, 2024 · Updated 2 years ago