Code for selecting an action based on multimodal inputs. In this case, the inputs are voice and text.
☆73Jun 7, 2021Updated 5 years ago
Alternatives and similar repositories for Multimodal-action-recognition
Users that are interested in Multimodal-action-recognition are comparing it to the libraries listed below.
- Official Pytorch Implementation for Continual Learning For On-Device Environmental Sound Classification☆14Jul 19, 2022Updated 3 years ago
- Multimodal late fusion for deepfake detection using video and audio data☆12May 7, 2019Updated 7 years ago
- Generalized cross-modal NNs; new audiovisual benchmark (IEEE TNNLS 2019)☆31Apr 13, 2020Updated 6 years ago
- MMAct Challenge☆13Jun 20, 2021Updated 4 years ago
- Multimodal speech recognition using lipreading (with CNNs) and audio (using LSTMs). Sensor fusion is done with an attention network.☆69Nov 19, 2022Updated 3 years ago
- Chinese BERT classification with tf2.0 and audio classification with mfcc☆14Dec 2, 2020Updated 5 years ago
- collection of skeleton-based human action recognition☆10Jun 28, 2020Updated 5 years ago
- ☆108Aug 24, 2022Updated 3 years ago
- (2020) Video Classification Neural Network☆30Feb 18, 2020Updated 6 years ago
- Multimodal datasets.☆34Jan 26, 2024Updated 2 years ago
- A collection of multimodal datasets, and visual features for VQA and captioning in pytorch. Just run "pip install multimodal"☆83Feb 25, 2022Updated 4 years ago
- ☆15Aug 13, 2020Updated 5 years ago
- A Pytorch implementation of emotion recognition from videos☆18Sep 15, 2020Updated 5 years ago
- Annotated dataset of quadrotor Eagle for object detection of UAVs☆15Apr 4, 2022Updated 4 years ago
- This repository contains various models targeting multimodal representation learning, multimodal fusion for downstream tasks such as mul…☆918Mar 15, 2023Updated 3 years ago
- ☆12May 20, 2026Updated 3 weeks ago
- Code for paper "Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition"☆18Jun 21, 2023Updated 2 years ago
- A web-based template for hosting systems for real-time music HCI.☆13Jul 6, 2024Updated last year
- video summarization lstm-gan pytorch implementation☆27Dec 6, 2019Updated 6 years ago
- Implementation of "With a Little Help from my Temporal Context: Multimodal Egocentric Action Recognition, BMVC, 2021" in PyTorch☆20Dec 16, 2021Updated 4 years ago
- Zicx's Notebook.☆11Nov 7, 2025Updated 7 months ago
- The code of paper "Enhancing Information Maximization with Distance-Aware Contrastive Learning for Source-Free Cross-Domain Few-Shot Lear…☆15Aug 28, 2025Updated 9 months ago
- crawl profiles of Japanese PornStars from Javhoo.com☆12Feb 8, 2020Updated 6 years ago
- Official PyTorch implementation of paper Leveraging Unimodal Self Supervised Learning for Multimodal Audio-Visual Speech Recognition (ACL…☆68Jul 13, 2022Updated 3 years ago
- ☆18Oct 16, 2024Updated last year
- The official code of paper "Multi-to-Single: Reducing Multimodal Dependency in Emotion Recognition through Contrastive Learning" (AAAI 20…☆33Sep 30, 2025Updated 8 months ago
- FINAL PROJECT - Visible light communication (VLC): Analysis and synthesis by simulation using MATLAB. Authors: Gabrielle Cristina de Sou…☆10Mar 1, 2021Updated 5 years ago
- acnn for text-independent speaker recognition☆10Feb 8, 2022Updated 4 years ago
- Awesome GNN Learning For beginners☆16Oct 18, 2021Updated 4 years ago
- ACAV100M: Automatic Curation of Large-Scale Datasets for Audio-Visual Video Representation Learning. In ICCV, 2021.☆64Nov 18, 2021Updated 4 years ago
- Official code for "Weakly Supervised Two-Stage Training Scheme for Deep Video Fight Detection Model"☆12Oct 29, 2022Updated 3 years ago
- (Competition) 6th -- Scene-Text-Detection-and-Recognition.☆10Jun 14, 2022Updated 4 years ago
- Official repository of "TDSD: Text-Driven Scene-Decoupled Weakly Supervised Video Anomaly Detection"☆11May 25, 2025Updated last year
- FG2021: Cross Attentional AV Fusion for Dimensional Emotion Recognition☆34Nov 29, 2024Updated last year
- Deep learning-aided multicarrier systems (MC-AE)☆10Dec 8, 2020Updated 5 years ago
- Codebase for the paper: "TIM: A Time Interval Machine for Audio-Visual Action Recognition"☆54Nov 7, 2024Updated last year
- Pytorch implementation of Meta-Learning for Short Utterance Speaker Recognition with Imbalance Length Pairs (Interspeech, 2020)☆74Sep 16, 2020Updated 5 years ago
- Autonomous Exploration of mobile robots in unknown environments using Deep Reinforcement learning☆14Oct 28, 2023Updated 2 years ago
- An official implementation for "UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation"☆365Jul 25, 2024Updated last year