M-VAD Names Dataset. Multimedia Tools and Applications (2019)
☆24 · Jul 9, 2019 · Updated 6 years ago
Alternatives and similar repositories for mvad-names-dataset
Users that are interested in mvad-names-dataset are comparing it to the libraries listed below.
- Character Grounding and Re-Identification in Story of Videos and Text Descriptions ☆10 · Jan 17, 2021 · Updated 5 years ago
- C/C++ startup template for developing fast immediate GUI using Dear Imgui with GLFW+GLAD ☆11 · Nov 16, 2020 · Updated 5 years ago
- ☆14 · Aug 9, 2018 · Updated 7 years ago
- Poet: Product-oriented Video Captioner for E-commerce ☆12 · Sep 21, 2020 · Updated 5 years ago
- Identity-Aware Multi-Sentence Video Description ☆15 · Jun 12, 2023 · Updated 2 years ago
- implement video caption based on openNMT ☆36 · Apr 19, 2018 · Updated 8 years ago
- Source code for Delving Deeper into the Decoder for Video Captioning ☆39 · Jun 1, 2021 · Updated 4 years ago
- Uncertainty on Asynchronous Time Event Prediction (Spotlight, Neurips 2019) ☆20 · Oct 8, 2020 · Updated 5 years ago
- ☆87 · Mar 4, 2024 · Updated 2 years ago
- Code for the ICCV 2011 paper "Semantic contours from inverse detectors" ☆12 · May 15, 2012 · Updated 13 years ago
- Code for "A Graph-Based Framework to Bridge Movies and Synopses", ICCV2019 ☆52 · Aug 9, 2020 · Updated 5 years ago
- implementation of TDConvED for video captioning ☆13 · Mar 18, 2020 · Updated 6 years ago
- ☆12 · Jan 12, 2016 · Updated 10 years ago
- Video content description model for generating descriptions for unconstrained videos ☆15 · Jul 5, 2019 · Updated 6 years ago
- [NeurIPS 2025] Watch and Listen: Understanding Audio-Visual-Speech Moments with Multimodal LLM ☆24 · Feb 10, 2026 · Updated 2 months ago
- Interactive multimedia captioning with Keras ☆16 · Aug 2, 2019 · Updated 6 years ago
- Expanded Cross Neighborhood distance based Re-ranking (ECN) ☆48 · May 14, 2020 · Updated 5 years ago
- Python implementation of extraction of several visual features representations from videos ☆23 · Jul 19, 2021 · Updated 4 years ago
- Self-supervised Siamese network (SSiam), FG 2019 ☆27 · Apr 21, 2023 · Updated 2 years ago
- ☆20 · Sep 19, 2019 · Updated 6 years ago
- Code and database for Jacquot et al. CVPR 2020. Can we decode subtle human activities? ☆12 · Dec 22, 2020 · Updated 5 years ago
- ☆23 · Jan 10, 2019 · Updated 7 years ago
- A PyTorch implementation of the paper Multimodal Transformer with Multiview Visual Representation for Image Captioning ☆25 · Sep 4, 2020 · Updated 5 years ago
- A pytorch implementation of "Robust Facial Landmark Detection by Multi-order Multi-constrained Network" ☆13 · Dec 9, 2020 · Updated 5 years ago
- This repository contains the video files (download links) and corresponding annotations used in the paper "Long-Term Face Tracking for Cr… ☆14 · Dec 18, 2020 · Updated 5 years ago
- ☆33 · Apr 20, 2018 · Updated 7 years ago
- Permutation invariant training in PyTorch ☆13 · Oct 2, 2020 · Updated 5 years ago
- Automatically setup the AISHELL-4 and MSDWild dataset for usage with pyannote-database (and pyannote-audio) ☆15 · Oct 22, 2025 · Updated 5 months ago
- Source code for the CVPR 2017 paper ☆64 · Apr 23, 2018 · Updated 7 years ago
- F-16 is a powerful video large language model (LLM) that perceives high-frame-rate videos, which is developed by the Department of Electr… ☆36 · Jul 3, 2025 · Updated 9 months ago
- Code for Oops! Predicting Unintentional Action in Video ☆80 · Apr 13, 2020 · Updated 6 years ago
- ☆35 · Mar 22, 2019 · Updated 7 years ago
- Code for the paper Joint Discovery of Object States and Manipulation Actions, ICCV 2017 ☆14 · Aug 7, 2018 · Updated 7 years ago
- Deep Future Gaze: Gaze Anticipation on Egocentric Videos Using Adversarial Networks ☆33 · Mar 12, 2020 · Updated 6 years ago
- A Few-Shot Learning based Approach to Multimodal Social Relation Extraction ☆14 · Jan 17, 2023 · Updated 3 years ago
- Source code of the paper titled *Improving Video Captioning with Temporal Composition of a Visual-Syntactic Embedding* ☆30 · Apr 16, 2021 · Updated 5 years ago
- Behavioral probing of language acquisition models at the lexical and syntactic level ☆19 · Jul 17, 2023 · Updated 2 years ago
- Source code for paper "Towards Automatic Learning of Procedures from Web Instructional Videos" ☆34 · Jan 6, 2019 · Updated 7 years ago
- Finalist entry for the M2CAI Workflow Challenge 2016 ☆10 · Nov 25, 2016 · Updated 9 years ago