Multimodal short video classification that integrates the video, image, audio, and text modalities
☆19Mar 12, 2020Updated 6 years ago
Alternatives and similar repositories for Multimodal-short-video-classification
Users that are interested in Multimodal-short-video-classification are comparing it to the libraries listed below.
- A dataset of 500,000 multimodal short videos with baseline models (TensorFlow 2.0).☆135Jul 23, 2019Updated 6 years ago
- Hand Gesture Controlled Tello Drone using Python and OpenCV 2021☆11Jun 6, 2022Updated 3 years ago
- ☆11May 18, 2022Updated 3 years ago
- Implementation of the paper "Real-Time Emotion Recognition via Attention Gated Hierarchical Memory Network" in AAAI-2020.☆31Sep 2, 2022Updated 3 years ago
- Annotated dataset of quadrotor Eagle for object detection of UAVs☆15Apr 4, 2022Updated 3 years ago
- Build Your Own Delivery Robot - Teleoperated, autonomous, lightweight and weatherproof. Free for personal use.☆13Oct 18, 2022Updated 3 years ago
- Multimodal Fusion, Multimodal Sentiment Analysis☆26Jun 20, 2020Updated 5 years ago
- This paper has been accepted in ACM ICMR 2021.☆20Nov 17, 2025Updated 4 months ago
- ☆12Jan 16, 2022Updated 4 years ago
- (Competition) 6th -- Scene-Text-Detection-and-Recognition.☆10Jun 14, 2022Updated 3 years ago
- A multimodal UAV assistant dataset.☆11Jun 14, 2021Updated 4 years ago
- Autonomous Exploration of mobile robots in unknown environments using Deep Reinforcement learning☆14Oct 28, 2023Updated 2 years ago
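- Multimodal data fusion: to perform multimodal data fusion, a multi-input classification network is first built with VGG16 and the CIFAR-10 dataset. Based on VGG16, the first three feature-extraction layers serve as the feature extractors for the different inputs, features are concatenated at an intermediate layer, the subsequent convolutional layers extract the fused features, and fully connected layers are added at the end. With minor modifications, the network can simultaneously extract…☆102Sep 25, 2020Updated 5 years ago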
- Mobile exploration robot with the ability to release an auxiliary drone to increase its sensing and operational capabilities.☆12Mar 16, 2024Updated 2 years ago
- ☆12Oct 13, 2017Updated 8 years ago
- ☆25Jun 3, 2020Updated 5 years ago
- Fine-Grained Visual Classification on Stanford Cars Dataset☆12Jun 21, 2022Updated 3 years ago
- R package for computing Utterance Emotion Dynamics☆22Jun 13, 2021Updated 4 years ago
- Multimodal classification solution for the SIGIR eCOM using Co-attention and transformer language models☆19Aug 17, 2020Updated 5 years ago
- Research on improving text sentiment analysis with facial features extracted from video, using machine learning.☆32Jan 12, 2018Updated 8 years ago
- autodock is a state machine based auto docking solution for differential-drive robot, allows accurate and reliable docking. Part of Secur…☆14Jun 28, 2025Updated 8 months ago
- Adding semantic segmentation into ORB-SLAM2 to build the point cloud for both background and objects.☆14Oct 27, 2023Updated 2 years ago
- Official PyTorch implementation of Multilogue-Net (Best paper runner-up at Challenge-HML @ ACL 2020)☆57Dec 8, 2022Updated 3 years ago
- [KDD'22] Partial Label Learning with Discrimination Augmentation☆10May 21, 2024Updated last year
- Converting VIS json label to VOS format☆12Feb 16, 2021Updated 5 years ago
- Multimodal video classification model☆30Nov 23, 2022Updated 3 years ago
- This repository shows how to implement a basic model for multimodal entailment.☆10Aug 17, 2021Updated 4 years ago
- A smart autonomous drone with Object Tracking and Object Detection capabilities☆16Jul 3, 2022Updated 3 years ago
- Multi-model analysis of sentiment and emotion in multi-speaker conversations.☆28Jul 6, 2023Updated 2 years ago
- Multimodal late fusion for deepfake detection using video and audio data☆12May 7, 2019Updated 6 years ago
- A rewarded-video points system implemented on top of mini-program local storage☆11Jun 19, 2019Updated 6 years ago
- The inference of DINOv2 ONNX models using the ONNXRuntime library.☆20Apr 24, 2025Updated 11 months ago
- Real-time visual Simultaneous Localization and Mapping using ORB-SLAM2 for a DJI Tello Drone☆14Apr 9, 2020Updated 5 years ago
- Large Language-and-Vision Assistant for BioMedicine, built towards multimodal GPT-4 level capabilities.☆10Nov 29, 2023Updated 2 years ago
- AI Drone with embedded GPU (Nvidia Jetson Nano) for computer vision and autonomous flight☆19Nov 29, 2021Updated 4 years ago
- 1st place solution to the DCASE 2020 - Task 5 - Urban Sound Tagging with Spatiotemporal Context☆16Dec 8, 2022Updated 3 years ago
- Implement a GRU/LSTM model using Keras, and train it to classify the languages using MFCC features☆25Aug 2, 2024Updated last year
- Implement attention model to LSTM using TensorFlow☆10Jul 3, 2018Updated 7 years ago
- ROS (Robot Operating System) nodes for traffic sign detection with YOLOv7 and ArUco marker detection and mapping☆16Feb 7, 2023Updated 3 years ago