PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.
☆14 · Aug 11, 2020 · Updated 5 years ago
Alternatives and similar repositories for SlowFast
Users that are interested in SlowFast are comparing it to the libraries listed below.
- Codebase for VidHal: Benchmarking Hallucinations in Vision LLMs ☆14 · Apr 23, 2026 · Updated last week
- The repo for "Class-aware Sounding Objects Localization", TPAMI 2021. ☆29 · Mar 4, 2022 · Updated 4 years ago
- Based on StackExchange.Redis that operates Tair For Redis Modules. ☆11 · Feb 28, 2025 · Updated last year
- ☆13 · Nov 28, 2021 · Updated 4 years ago
- The official implementation of V-AURA: Temporally Aligned Audio for Video with Autoregression (ICASSP 2025) (Oral) ☆33 · Feb 11, 2026 · Updated 2 months ago
- ☆13 · Jul 10, 2024 · Updated last year
- A paper list of Weakly Supervised Object Detection (WSOD) resources. ☆13 · May 6, 2021 · Updated 4 years ago
- Revisiting Test Time Adaptation Under Online Evaluation ☆36 · May 2, 2024 · Updated 2 years ago
- Self-Supervised Learning by Cross-Modal Audio-Video Clustering (NeurIPS 2020) ☆91 · Oct 24, 2022 · Updated 3 years ago
- [Arxiv2022] Revitalize Region Feature for Democratizing Video-Language Pre-training ☆22 · Mar 19, 2022 · Updated 4 years ago
- The dataset consists of public social media url pairs and the corresponding entailment label for an external conference (ACL 2021). Each … ☆14 · Aug 16, 2021 · Updated 4 years ago
- Listen to Look: Action Recognition by Previewing Audio (CVPR 2020) ☆130 · Aug 31, 2021 · Updated 4 years ago
- Unlocking the Essence of Beauty: Advanced Aesthetic Reasoning with Relative-Absolute Policy Optimization ☆24 · Jan 27, 2026 · Updated 3 months ago
- Official repository for "Self-Supervised Video Transformer" (CVPR'22) ☆108 · Jun 26, 2024 · Updated last year
- The repository contains the Pytorch Implementation of the paper Age invariant face recognition and retrieval by coupled auto-encoder netw… ☆13 · Dec 17, 2022 · Updated 3 years ago
- Models, data, and codes for the paper: MetaAligner: Towards Generalizable Multi-Objective Alignment of Language Models ☆25 · Sep 26, 2024 · Updated last year
- Official PyTorch implementation of ReWaS (AAAI'25) "Read, Watch and Scream! Sound Generation from Text and Video" ☆44 · Dec 13, 2024 · Updated last year
- HDU - A small script for submitting teacher evaluations at the end of term; it needs to be opened in the console ☆14 · May 21, 2016 · Updated 9 years ago
- This is an official implementation of GRIT-VLP ☆20 · Aug 8, 2022 · Updated 3 years ago
- [ACL2026] Uni-MMMU : A Massive Multi-discipline Multimodal Unified Benchmark ☆24 · Apr 13, 2026 · Updated 3 weeks ago
- EmoCapCLIP: Learning Transferable Facial Emotion Representations from Large-Scale Semantically Rich Captions ☆21 · Jul 29, 2025 · Updated 9 months ago
- ☆11 · Jan 29, 2023 · Updated 3 years ago
- ☆14 · Nov 13, 2023 · Updated 2 years ago
- A MaskGIT port from JAX to PyTorch ☆18 · Jun 18, 2022 · Updated 3 years ago
- LANCE: Stress-testing Visual Models by Generating Language-guided Counterfactual Images ☆31 · Nov 30, 2023 · Updated 2 years ago
- [AAAI'26] Official implementation of CMMCoT: Enhancing Complex Multi-Image Comprehension via Multi-Modal Chain-of-Thought and Memory Augm… ☆12 · Dec 5, 2025 · Updated 5 months ago
- A simple demo project of cmake and google protocol buffer. ☆10 · Dec 3, 2013 · Updated 12 years ago
- Shaping Visual Representations with Language for Few-shot Classification, ACL 2020 ☆16 · May 9, 2021 · Updated 4 years ago
- PyTorch GPU distributed training code for MIL-NCE HowTo100M ☆220 · Jul 5, 2022 · Updated 3 years ago
- Guide diffusion on ImageBind embedding similarity ☆29 · May 27, 2023 · Updated 2 years ago
- [CVPR 2024] KEPP: Why Not Use Your Textbook? Knowledge-Enhanced Procedure Planning of Instructional Videos ☆12 · Sep 24, 2024 · Updated last year
- [NeurIPS 2025] This is the official repository for "RAD: Towards Trustworthy Retrieval-Augmented Multi-modal Clinical Diagnosis" ☆27 · Nov 21, 2025 · Updated 5 months ago
- [ICME 2025] DiffusionTalker: Efficient and Compact Speech-Driven 3D Talking Head via Personalizer-Guided Distillation ☆24 · Mar 25, 2025 · Updated last year
- [ACL 2025] RuleArena: A Benchmark for Rule-Guided Reasoning with LLMs in Real-World Scenarios ☆26 · Jul 2, 2025 · Updated 10 months ago
- Implementation of "Slow-Fast Auditory Streams for Audio Recognition, ICASSP, 2021" in PyTorch ☆73 · Sep 27, 2021 · Updated 4 years ago
- Collection of open datasets in computer vision. ☆13 · Jun 9, 2018 · Updated 7 years ago
- Source code for "Bi-modal Transformer for Dense Video Captioning" (BMVC 2020) ☆231 · Apr 8, 2023 · Updated 3 years ago
- Implementation of the paper "Online dictionary learning for sparse coding" (Mairal et al). ☆20 · Apr 6, 2018 · Updated 8 years ago
- ☆11 · Aug 11, 2023 · Updated 2 years ago