nianfd / RWKV-VG
☆8Updated 7 months ago
Alternatives and similar repositories for RWKV-VG
Users that are interested in RWKV-VG are comparing it to the libraries listed below
- LLaVA-MR: Large Language-and-Vision Assistant for Video Moment Retrieval☆8Updated 7 months ago
- Benchmarking for Audio-Text and Audio-Visual Generation; Supports FAD, FD_VGG, FD_PANNs, FD_PaSST, IS_PaSST, IS_PANNs, KL_PaSST, KL_PANNs…☆21Updated 4 months ago
- SAVEn-Vid: Synergistic Audio-Visual Integration for Enhanced Understanding in Long Video Context☆5Updated 6 months ago
- ☆11Updated 3 months ago
- Welcome to AudioCIL, the toolbox for audio class-incremental learning with the most implemented methods.☆32Updated 6 months ago
- [⭐️ WACV 2025 Oral ⭐️] PETALface: Parameter Efficient Transfer Learning for Low-resolution Face Recognition☆13Updated last month
- Video Anomaly Detection with Motion and Appearance Guided Patch Diffusion Model☆11Updated 3 months ago
- ☆8Updated 5 months ago
- [ACMMM 2024] Consistent123: One Image to Highly Consistent 3D Asset Using Case-Aware Diffusion Priors☆23Updated 8 months ago
- Video Language Model for Motern AI☆9Updated 9 months ago
- ☆16Updated 6 months ago
- unofficial☆10Updated 8 months ago
- Papers of "A Survey on Large Multi-Modal Models from the Perspective of Input-Output Space Extension"☆10Updated 7 months ago
- This is the official pytorch implementation for paper: Filter, Correlate, Compress: Training-Free Token Reduction for MLLM Acceleration☆15Updated 3 months ago
- [CVPR 2024 Highlight] Official implementation of the paper: Cooperation Does Matter: Exploring Multi-Order Bilateral Relations for Audio-…☆39Updated 2 months ago
- Implementation of "Advancing Video Anomaly Detection: A Concise Review and a New Dataset" (NeurIPS 2024). [MSAD Dataset]☆15Updated 3 months ago
- Robust End-to-end Point-Supervised Tiny Object Detection☆8Updated 2 months ago
- (AAAI 2025) Official PyTorch implementation of paper "SAUGE: Taming SAM for Uncertainty-Aligned Multi-Granularity Edge Detection".☆16Updated 2 months ago
- [NeurIPS 2024] Mixture of Experts for Audio-Visual Learning☆15Updated 5 months ago
- official repository for ATM-Traffic☆10Updated 3 months ago
- Official Implementation of Towards Open Vocabulary Video Semantic Segmentation☆10Updated 4 months ago
- DeepDubber-V1: Towards High Quality and Dialogue, Narration, Monologue Adaptive Movie Dubbing Via Multi-Modal Chain-of-Thoughts Reasoning…☆22Updated 2 weeks ago
- [CVPR 2025] Crab: A Unified Audio-Visual Scene Understanding Model with Explicit Cooperation☆50Updated last month
- ☆9Updated 6 months ago
- ☆20Updated 4 months ago
- Official Implementation of "Open-Vocabulary Audio-Visual Semantic Segmentation" [ACM MM 2024 Oral].☆29Updated 8 months ago
- The official implementation of Freeze-Omni.☆13Updated this week
- This project is mainly the assessment project for the 2025 Zhejiang University School of Software summer camp (AI camp)☆11Updated 4 months ago
- Official implementation of ResCLIP: Residual Attention for Training-free Dense Vision-language Inference☆39Updated 4 months ago
- A streaming speech-to-text system based on OpenAI whisper-large-v3-turbo☆9Updated 7 months ago