EDUVSUM is a multimodal neural architecture that utilizes state-of-the-art audio, visual and textual features to identify important temporal segments in educational videos.
☆23Mar 8, 2024Updated 2 years ago
Alternatives and similar repositories for EDUVSUM
Users who are interested in EDUVSUM are comparing it to the libraries listed below.
- pytorch implementation of Semantics-AssistedVideoCaptioning☆11Feb 16, 2023Updated 3 years ago
- This repository contains the code for our ICASSP paper `Speech Emotion Recognition using Semantic Information` https://arxiv.org/pdf/2103…☆27Mar 18, 2021Updated 5 years ago
- We have implemented Track # 1 for ICME 2024: Spatial Action Localization on Chaotic World dataset. Our mAP on the validation set reaches …☆12Nov 11, 2024Updated last year
- [WACV 2025] Exploiting VLM Localizability and Semantics for Open Vocabulary Action Detection☆16Mar 23, 2025Updated last year
- ☆17Aug 6, 2021Updated 4 years ago
- Extension of hLSTMat☆19Apr 15, 2021Updated 4 years ago
- ☆11Aug 7, 2024Updated last year
- [ECCV 2020] PyTorch code of MMT (a multimodal transformer captioning model) on TVCaption dataset☆91Sep 6, 2023Updated 2 years ago
- [ICCV 2021 Oral + TPAMI] Just Ask: Learning to Answer Questions from Millions of Narrated Videos☆127Sep 29, 2023Updated 2 years ago
- [CVPR 2024] Adapting Short-Term Transformers for Action Detection in Untrimmed Videos☆12Jun 11, 2024Updated last year
- The code repository for EMNLP 2021 paper "Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization".☆57Jan 14, 2022Updated 4 years ago
- PyTorch implementation of the models described in the IEEE ICASSP 2022 paper "Is cross-attention preferable to self-attention for multi-m…☆64Mar 29, 2025Updated 11 months ago
- Code supporting the ISMIR 2020 Klio Tutorial☆20Oct 11, 2020Updated 5 years ago
- 1st place solution to the DCASE 2020 - Task 5 - Urban Sound Tagging with Spatiotemporal Context☆16Dec 8, 2022Updated 3 years ago
- Official PyTorch implementation for "Adaptive Multi-scale Online Likelihood Network for AI-assisted Interactive Segmentation" (MONet)☆12Mar 28, 2023Updated 2 years ago
- VIA modification for sign language annotation☆18Apr 30, 2021Updated 4 years ago
- flask+tornado based NVIDIA tacotron2+waveglow tts web app☆28May 25, 2023Updated 2 years ago
- ☆11May 18, 2022Updated 3 years ago
- Submission to MediaEval 2021 Emotions and Themes in Music challenge. Noisy-student training for music emotion tagging☆11Dec 2, 2021Updated 4 years ago
- [ICIP 2022 oral] VLCap: Vision-Language with Contrastive Learning for Coherent Video Paragraph Captioning☆28Jun 28, 2023Updated 2 years ago
- Unofficial implementation for Sigmoid Loss for Language Image Pre-Training☆11Sep 26, 2023Updated 2 years ago
- The PyTorch code of the AAAI2021 paper "Non-Autoregressive Coarse-to-Fine Video Captioning".☆58Oct 22, 2023Updated 2 years ago
- Acoustic Scene Classification using transfer learning on VGGish pre-trained model☆11Jan 3, 2018Updated 8 years ago
- CPU Physically Based Path Tracer Engine☆15May 14, 2021Updated 4 years ago
- This project is an intelligent data analysis platform based on iFlytek Spark☆26Aug 28, 2024Updated last year
- The recipes to build the third-party libraries for MeVisLab☆15Dec 11, 2025Updated 3 months ago
- An adjustment of the existing Virtual Makeup repository https://github.com/srivatsan-ramesh/Virtual-Makeup and https://github.com/badarsh…☆11Mar 13, 2020Updated 6 years ago
- Official implementation of our EMNLP 2022 paper "CPL: Counterfactual Prompt Learning for Vision and Language Models"☆35Dec 5, 2022Updated 3 years ago
- Stable-diffusion-WebUI extensions, which enable tensorrt accelerated Unet for SDXL base model☆12Oct 18, 2023Updated 2 years ago
- Provides current Voreen Sources (with modifications) by Uni Münster to build voreen for PC, server or lrz cluster, including workspaces a…☆12Mar 2, 2024Updated 2 years ago
- The material is covered in my YouTube playlist "Data Wrangling with Python" available on YUNIKARN.☆15Dec 9, 2025Updated 3 months ago
- ☆15Sep 16, 2021Updated 4 years ago
- [ACMMM 2022] ReCoRo: Region-Controllable Robust Light Enhancement by User-Specified Imprecise Masks☆15Feb 6, 2023Updated 3 years ago
- simplyDICOM for Android☆17Feb 18, 2020Updated 6 years ago
- ☆10Nov 10, 2021Updated 4 years ago
- ☆13Feb 8, 2017Updated 9 years ago
- Tencent_AILab_ChineseEmbedding☆12Dec 30, 2018Updated 7 years ago
- A fine multimodality fusion network :)☆11Aug 9, 2021Updated 4 years ago
- Test Code for Super Resolution in MRI☆11Sep 17, 2018Updated 7 years ago