We have implemented Track #1 for ICME 2024: Spatial Action Localization on the Chaotic World dataset. Our mAP on the validation set reaches 26.62%; if we directly use the officially provided chaos_test_1fps.csv as the object detection results, the mAP reaches 42.28%.
☆12Nov 11, 2024Updated last year
Alternatives and similar repositories for SlowFast-Meet-ViT
Users that are interested in SlowFast-Meet-ViT are comparing it to the libraries listed below
- [ICCV2023] Chaotic World: A Large and Challenging Benchmark for Human Behavior Understanding in Chaotic Events☆10Dec 7, 2024Updated last year
- The Notion Citation Updater is a Python script designed to automate the process of updating citation counts for academic papers stored in…☆14Oct 28, 2024Updated last year
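- Course notes and code repository for "Introduction to Generative AI, Spring 2024" by Prof. Hung-yi Lee of National Taiwan University, covering frontier topics such as large language models (LLMs), diffusion models, and multimodal generation, with homework implementations, close readings of papers, and hands-on projects.☆21Aug 13, 2025Updated 7 months ago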
- Video classification exercise using UCF101 data for training an early-fusion and SlowFast architecture model, both using the PyTorch Ligh…☆15Jan 11, 2022Updated 4 years ago
- [WACV 2025] Exploiting VLM Localizability and Semantics for Open Vocabulary Action Detection☆16Mar 23, 2025Updated 11 months ago
- EIVideo: an interactive, intelligent video annotation tool; a few mouse clicks free up your hands and make video annotation easier☆32Jul 4, 2022Updated 3 years ago
- [NeurIPS 2024] CHASE: Learning Convex Hull Adaptive Shift for Skeleton-based Multi-Entity Action Recognition☆16Nov 12, 2025Updated 4 months ago
- Implementation of the paper: VG4D: Vision-Language Model Goes 4D Video Recognition (ICRA 2024)☆15Apr 23, 2024Updated last year
- Code and model for the AI City Challenge (CVPR 2022) Track 3 Action Detection (Naturalistic Driving Action Recognition)☆28Jul 22, 2023Updated 2 years ago
- [CVPR 2024] Adapting Short-Term Transformers for Action Detection in Untrimmed Videos☆12Jun 11, 2024Updated last year
- Benchmarking Multi-Step Spatial Reasoning in MLLMs with LEGO-based VQA & generation tasks.☆36Jun 20, 2025Updated 9 months ago
- [CVPR 2023] STMixer: A One-Stage Sparse Action Detector☆63May 18, 2023Updated 2 years ago
- Dr. Wang's repository☆12Nov 30, 2019Updated 6 years ago
- A project for deploying a YOLO server that runs inference on images sent by different clients.☆10Mar 23, 2024Updated last year
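- 北斗七星 (Big Dipper) integrates the acquisition and upload of GPS coordinates of various types for centralized storage, analysis, and display; it supports NMEA 0183 data parsing and external Bluetooth devices, and integrates common internet LBS positioning modules☆11Sep 2, 2019Updated 6 years ago
- Inference for GOT-OCR2.0 using mnn-llm☆13Oct 2, 2024Updated last year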
- [CICAI 2023] Implementation of the paper “Integrating Human Parsing and Pose Network for Human Action Recognition”.☆11Sep 24, 2024Updated last year
- A deep learning training framework implemented from scratch in C++ and Python☆12Nov 22, 2020Updated 5 years ago
- Component-based development with MVVM, Kotlin, and CC☆11Mar 13, 2020Updated 6 years ago
- [CVPR 2023] PyTorch code of MixPHM: Redundancy-Aware Parameter-Efficient Tuning for Low-Resource Visual Question Answering☆17Jul 11, 2023Updated 2 years ago
- In this real-time car detection project we use the YOLOv8 model (from Ultralytics) for vehicle detection, and deep_sort_pytorch. …☆16Feb 8, 2025Updated last year
- PyTorch I3D implementation on Toyota Smarthome Dataset☆17Apr 23, 2022Updated 3 years ago
- [NeurIPS 2023] Official implementation of the paper "CAST: Cross-Attention in Space and Time for Video Action Recognition"☆54Dec 28, 2023Updated 2 years ago
- An agricultural engineering robot based on OpenSCAD☆13May 20, 2020Updated 5 years ago
- Parsing KML files with the JAK package☆10Dec 8, 2018Updated 7 years ago
- [ICCV 2023] Efficient Video Action Detection with Token Dropout and Context Refinement☆38Sep 27, 2023Updated 2 years ago
- [IROS 2023] Interactive Spatiotemporal Token Attention Network for Skeleton-based General Interactive Action Recognition☆21Jul 12, 2025Updated 8 months ago
- LSTC: Boosting Atomic Action Detection with Long-Short-Term Context☆10Sep 1, 2022Updated 3 years ago
- A web-based multi-party live television streamer☆12Feb 21, 2026Updated last month
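- Several classic examples in AI, ML, DA, DV, and related areas, including traffic jam simulation (NagelSchreckenberg), the Monte Carlo Queuing Problem, face recognition (RecognitionFace), and image inference with a genetic algorithm (IconGenetic)☆10Oct 14, 2018Updated 7 years ago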
- EDUVSUM is a multimodal neural architecture that utilizes state-of-the-art audio, visual and textual features to identify important tempo…☆23Mar 8, 2024Updated 2 years ago
- [NeurIPS 2022 Spotlight] VideoMAE for Action Detection☆69Feb 3, 2023Updated 3 years ago
- This project uses YOLOv8/AnimeGAN and Flask to perform background segmentation, background removal and background replacem…☆12Apr 12, 2024Updated last year
- [ICASSP 2024] Official code for Slowfast Network for Continuous Sign Language Recognition☆64Jul 4, 2025Updated 8 months ago
- My optimization (with and without gradient) library☆15Aug 28, 2025Updated 6 months ago