Multi-modal transformer approach for natural language query based joint video summarization and highlight detection
☆17May 23, 2024Updated last year
Alternatives and similar repositories for Visionary-Vids
Users who are interested in Visionary-Vids are comparing it to the repositories listed below
- A PyTorch implementation of the software used in: "A study on the use of attention for explaining video summarization" (NarSUM Workshop a…☆11Oct 20, 2023Updated 2 years ago
- ☆15Aug 4, 2025Updated 7 months ago
- A collection of papers and code for video summarization / video highlight detection / video key frame selection☆37Jul 16, 2021Updated 4 years ago
- ☆19May 19, 2024Updated last year
- Official PyTorch repository for CG-DETR "Correlation-guided Query-Dependency Calibration in Video Representation Learning for Temporal Gr…☆151Aug 21, 2024Updated last year
- Official PyTorch repository for "QD-DETR: Query-Dependent Video Representation for Moment Retrieval and Highlight Detection" (CVPR 2023 …☆246Aug 12, 2025Updated 6 months ago
- A PyTorch Implementation of CA-SUM from "Summarizing Videos using Concentrated Attention and Considering the Uniqueness and Diversity of …☆31Jun 29, 2022Updated 3 years ago
- Implementation of Cross-category Video Highlight Detection via Set-based Learning (ICCV 2021).☆79Aug 27, 2021Updated 4 years ago
- ☆40Apr 16, 2024Updated last year
- Example application for creating an MVC Express + Node + TypeScript app and deploying it to Azure☆10Nov 8, 2018Updated 7 years ago
- 📦 A collection of pastable code gathered from past projects☆12Sep 9, 2024Updated last year
- ☆34Jun 2, 2023Updated 2 years ago
- Cross-modal background suppression for audio-visual event localization☆36Mar 18, 2022Updated 3 years ago
- This is the official repository of the papers "Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers" and "Efficient Fi…☆38Jul 31, 2024Updated last year
- [CVPR 2024] MMSum: A Dataset for Multimodal Summarization and Thumbnail Generation of Videos☆37Jan 29, 2025Updated last year
- [NeurIPS 2021] Moment-DETR code and QVHighlights dataset☆344Apr 18, 2024Updated last year
- UMT is a unified and flexible framework which can handle different input modality combinations, and output video moment retrieval and/or …☆236Apr 15, 2024Updated last year
- This repository contains the codebase for MovieCLIP: Visual Scene Recognition in Movies☆42Oct 1, 2023Updated 2 years ago
- Speech understanding system training toolkit, including tasks of ASR, SSL, LM, etc.☆11Feb 12, 2026Updated 3 weeks ago
- Implementation code for the paper: Multimodal Sentiment Analysis with Mutual Information-based Disentangled Representation Learning☆18May 8, 2025Updated 10 months ago
- A codebase for data crawling and preprocessing for TTS and ASR systems training.☆22Feb 26, 2026Updated last week
- The code and data for "Summary-Oriented Vision Modeling for Multimodal Abstractive Summarization"☆11May 16, 2023Updated 2 years ago
- ☆11Apr 20, 2023Updated 2 years ago
- Implementation of "Look, Listen and Recognise: character-aware audio-visual subtitling"☆19Nov 3, 2025Updated 4 months ago
- [ICCV 2023] UniVTG: Towards Unified Video-Language Temporal Grounding☆376May 8, 2024Updated last year
- Changes in this fork have been merged to upstream.☆16Jun 10, 2025Updated 8 months ago
- SAM4SS: Tailoring SAM and SAM2 for Semantic Segmentation☆11Jul 31, 2024Updated last year
- ☆10Oct 16, 2025Updated 4 months ago
- ☆10Aug 1, 2022Updated 3 years ago
- ☆10Jan 18, 2024Updated 2 years ago
- ☆14Aug 21, 2017Updated 8 years ago
- We present a study of a neural network based method for speech emotion recognition, using audio-only features. In the studied scheme, the…☆11Jul 24, 2024Updated last year
- SSL Video Representation Learning project☆14Jul 8, 2025Updated 8 months ago
- assistant that runs entirely on‑device on Apple‑silicon Macs (M‑series). Chats with a 4‑bit Llama‑3 model accelerated by MLX, and speak…☆14Jun 13, 2025Updated 8 months ago
- ☆11Sep 29, 2023Updated 2 years ago
- Generalized Method of Moments estimation☆13Mar 23, 2025Updated 11 months ago
- Cross-platform React components for ReactDOM and React Native☆10Jan 4, 2023Updated 3 years ago
- Firebase application template built on moltres framework☆12Apr 17, 2023Updated 2 years ago
- A tracery Twitter bot, generating graphic scores to inspire musicians, composers, and anyone else.☆10Mar 12, 2016Updated 9 years ago