Code for selecting an action based on multimodal inputs; in this case, the inputs are voice and text.
☆73 · Jun 7, 2021 · Updated 4 years ago
Alternatives and similar repositories for Multimodal-action-recognition
Users that are interested in Multimodal-action-recognition are comparing it to the libraries listed below.
- Multimodal late fusion for deepfake detection using video and audio data ☆12 · May 7, 2019 · Updated 6 years ago
- Generalized cross-modal NNs; new audiovisual benchmark (IEEE TNNLS 2019) ☆31 · Apr 13, 2020 · Updated 5 years ago
- Pytorch implementation of DSR-RL for Video Summarization Task ☆12 · Aug 30, 2021 · Updated 4 years ago
- Multimodal speech recognition using lipreading (with CNNs) and audio (using LSTMs). Sensor fusion is done with an attention network. ☆69 · Nov 19, 2022 · Updated 3 years ago
- Chinese BERT classification with tf2.0 and audio classification with mfcc ☆14 · Dec 2, 2020 · Updated 5 years ago
- collection of skeleton-based human action recognition ☆10 · Jun 28, 2020 · Updated 5 years ago
- ☆108 · Aug 24, 2022 · Updated 3 years ago
- (2020) Video Classification Neural Network ☆30 · Feb 18, 2020 · Updated 6 years ago
- Multimodal datasets. ☆34 · Jan 26, 2024 · Updated 2 years ago
- A collection of multimodal datasets, and visual features for VQA and captioning in pytorch. Just run "pip install multimodal" ☆84 · Feb 25, 2022 · Updated 4 years ago
- ☆15 · Aug 13, 2020 · Updated 5 years ago
- A search engine implementation using OpenAI's clip model ☆10 · Jun 20, 2021 · Updated 4 years ago
- Pretraining summarization models using a corpus of nonsense ☆13 · Sep 28, 2021 · Updated 4 years ago
- Emotion analysis on DREAMER dataset using various Deep Learning Techniques ☆13 · Jan 1, 2021 · Updated 5 years ago
- A Pytorch implementation of emotion recognition from videos ☆18 · Sep 15, 2020 · Updated 5 years ago
- PLLay: Efficient Topological Layer based on Persistence Landscapes ☆23 · Dec 10, 2020 · Updated 5 years ago
- This repository contains various models targeting multimodal representation learning, multimodal fusion for downstream tasks such as mul… ☆908 · Mar 15, 2023 · Updated 3 years ago
- Code for paper "Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition" ☆18 · Jun 21, 2023 · Updated 2 years ago
- Transformer for Action Recognition in PyTorch ☆38 · Mar 14, 2020 · Updated 6 years ago
- Implementation of "With a Little Help from my Temporal Context: Multimodal Egocentric Action Recognition, BMVC, 2021" in PyTorch ☆20 · Dec 16, 2021 · Updated 4 years ago
- The official code of paper "Multi-to-Single: Reducing Multimodal Dependency in Emotion Recognition through Contrastive Learning" (AAAI 20… ☆30 · Sep 30, 2025 · Updated 5 months ago
- Self-Supervised Learning by Cross-Modal Audio-Video Clustering (NeurIPS 2020) ☆91 · Oct 24, 2022 · Updated 3 years ago
- FG2021: Cross Attentional AV Fusion for Dimensional Emotion Recognition ☆33 · Nov 29, 2024 · Updated last year
- Code for the paper: Audio-Visual Model Distillation Using Acoustic Images ☆21 · Mar 24, 2023 · Updated 3 years ago
- Codebase for the paper: "TIM: A Time Interval Machine for Audio-Visual Action Recognition" ☆54 · Nov 7, 2024 · Updated last year
- Pytorch implementation of Meta-Learning for Short Utterance Speaker Recognition with Imbalance Length Pairs (Interspeech, 2020) ☆75 · Sep 16, 2020 · Updated 5 years ago
- Autonomous Exploration of mobile robots in unknown environments using Deep Reinforcement learning ☆14 · Oct 28, 2023 · Updated 2 years ago
- An official implementation for "UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation" ☆365 · Jul 25, 2024 · Updated last year
- Welcome to the DeepTrack GitHub repository, a cutting-edge solution for vehicle trajectory prediction in the realm of intelligent transpo… ☆19 · Oct 27, 2023 · Updated 2 years ago
- Pytorch implementation of RNN, CNN, BiGRU and LSTM for text classification ☆10 · Apr 30, 2021 · Updated 4 years ago
- Source code for ScaleGrad ☆19 · Dec 28, 2021 · Updated 4 years ago
- Simple phoenix setup for padded window management ☆13 · Apr 25, 2018 · Updated 7 years ago
- Code used to run experiments for the ICLR 2023 paper "Computational Language Acquisition with Theory of Mind". ☆15 · Apr 27, 2023 · Updated 2 years ago
- PyTorch Implementation on Paper [CVPR2021] Distilling Audio-Visual Knowledge by Compositional Contrastive Learning ☆89 · Jul 7, 2021 · Updated 4 years ago
- Resources for: Cross-Lingual Disaster-related Multi-label Tweet Classification with Manifold Mixup (ACL SRW 2020) ☆11 · Sep 9, 2021 · Updated 4 years ago
- This project is a PyTorch implementation that uses deep CNN to recognize multi-digit numbers using the SVHN dataset derived from Google S… ☆23 · Nov 3, 2025 · Updated 4 months ago
- ☆11 · Feb 17, 2017 · Updated 9 years ago
- Computer graphics course project with report; OpenGL and Qt; a graphics drawing system (drawing board); release build, with an exe that runs directly ☆11 · Feb 9, 2022 · Updated 4 years ago
- [NeurIPS 2024 Spotlight] Code for the paper "Flex-MoE: Modeling Arbitrary Modality Combination via the Flexible Mixture-of-Experts" ☆75 · Jun 9, 2025 · Updated 9 months ago