EDUVSUM is a multimodal neural architecture that utilizes state-of-the-art audio, visual and textual features to identify important temporal segments in educational videos.
☆23 · Mar 8, 2024 · Updated 2 years ago
Alternatives and similar repositories for EDUVSUM
Users that are interested in EDUVSUM are comparing it to the libraries listed below.
- The implementation of the paper "Video Summarization using Deep Semantic Features" in ACCV'16 ☆18 · Jun 23, 2019 · Updated 6 years ago
- PyTorch implementation of Multi-modal Dense Video Captioning (CVPR 2020 Workshops) ☆144 · Apr 8, 2023 · Updated 3 years ago
- PyTorch implementation of Semantics-AssistedVideoCaptioning ☆11 · Feb 16, 2023 · Updated 3 years ago
- We have implemented Track # 1 for ICME 2024: Spatial Action Localization on Chaotic World dataset. Our mAP on the validation set reaches … ☆12 · Nov 11, 2024 · Updated last year
- [WACV 2025] Exploiting VLM Localizability and Semantics for Open Vocabulary Action Detection ☆17 · Mar 23, 2025 · Updated last year
- [ECCV 2020] PyTorch code of MMT (a multimodal transformer captioning model) on TVCaption dataset ☆91 · Sep 6, 2023 · Updated 2 years ago
- [ICCV 2021 Oral + TPAMI] Just Ask: Learning to Answer Questions from Millions of Narrated Videos ☆127 · Sep 29, 2023 · Updated 2 years ago
- [CVPR 2024] Adapting Short-Term Transformers for Action Detection in Untrimmed Videos ☆11 · Jun 11, 2024 · Updated last year
- Code for GHA (ACCV2018) ☆13 · Oct 31, 2018 · Updated 7 years ago
- The code repository for EMNLP 2021 paper "Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization". ☆57 · Jan 14, 2022 · Updated 4 years ago
- PyTorch implementation of the models described in the IEEE ICASSP 2022 paper "Is cross-attention preferable to self-attention for multi-m… ☆65 · Mar 29, 2025 · Updated last year
- MemRec ☆57 · Mar 17, 2026 · Updated last month
- This repository shows how to implement a basic model for multimodal entailment. ☆10 · Aug 17, 2021 · Updated 4 years ago
- Attention Based Multi-modal Emotion Recognition; Stanford Emotional Narratives Dataset ☆17 · Aug 21, 2019 · Updated 6 years ago
- Toolbox for IBP Coupled SPCM-CRP Hidden Markov Model. Also contains code for EM-based HMM learning and inference for Bayesian non-paramet… ☆14 · Mar 21, 2019 · Updated 7 years ago
- 1st place solution to the DCASE 2020 - Task 5 - Urban Sound Tagging with Spatiotemporal Context ☆16 · Dec 8, 2022 · Updated 3 years ago
- Implementation of Aligned Cluster Analysis ☆18 · Sep 29, 2018 · Updated 7 years ago
- Deployed a facial emotion recognition neural network model that predicts emotion from faces in images, videos and live feed fr… ☆11 · May 2, 2021 · Updated 5 years ago
- Submission to MediaEval 2021 Emotions and Themes in Music challenge. Noisy-student training for music emotion tagging ☆11 · Dec 2, 2021 · Updated 4 years ago
- [ICIP 2022 oral] VLCap: Vision-Language with Contrastive Learning for Coherent Video Paragraph Captioning ☆28 · Jun 28, 2023 · Updated 2 years ago
- A CNN audio classifier via spectrogram images. ☆10 · Jul 21, 2017 · Updated 8 years ago
- This is the official repository for "Can GPTs Evaluate Graphic Design Based on Design Principles?". ☆13 · Feb 10, 2025 · Updated last year
- Unofficial implementation for Sigmoid Loss for Language Image Pre-Training ☆11 · Sep 26, 2023 · Updated 2 years ago
- Acoustic Scene Classification using transfer learning on VGGish pre-trained model ☆11 · Jan 3, 2018 · Updated 8 years ago
- Official code for the ACL 2021 Findings paper "Yichi Zhang and Joyce Chai. Hierarchical Task Learning from Language Instructions with Uni… ☆24 · Jun 28, 2021 · Updated 4 years ago
- Generalized Product Quantization Network For Semi-supervised Image Retrieval - CVPR 2020 ☆63 · May 27, 2024 · Updated last year
- Experimental treesitter based language server. 😆 ☆17 · Mar 20, 2023 · Updated 3 years ago
- Code4Bench: A Multidimensional Benchmark of Codeforces Data for Different Program Analysis Techniques ☆17 · Apr 12, 2019 · Updated 7 years ago
- An attempt at genre classification with convolutional neural networks and spectrograms ☆15 · Nov 25, 2017 · Updated 8 years ago
- Demo of using the Wenxin Qianfan LLM with the Milvus vector database ☆22 · Sep 14, 2023 · Updated 2 years ago
- A library to manipulate Inkscape SVG content using Python 3 ☆12 · Apr 28, 2021 · Updated 5 years ago
- Stable-diffusion-WebUI extensions, which enable tensorrt accelerated Unet for SDXL base model ☆12 · Oct 18, 2023 · Updated 2 years ago
- A web app to post emoji implemented in connect-go and connect-web. ☆16 · Dec 10, 2023 · Updated 2 years ago
- A TreeSitter parser for Neorg's `document.metadata` Tag ☆18 · Mar 23, 2026 · Updated last month
- (TIP'2023) Concept-Aware Video Captioning: Describing Videos with Effective Prior Information ☆33 · Dec 26, 2024 · Updated last year