aimclub / OCEANAI
Algorithms for Intelligent Assessment of Human Personality Traits based on His Multimodal Data for ranking potential candidates to perform professional responsibilities
★44 · Updated 8 months ago
Alternatives and similar repositories for OCEANAI
Users who are interested in OCEANAI are comparing it to the libraries listed below.
- Whisper Speaker Identification (WSI), a cutting-edge model for multilingual speaker identification. ★23 · Updated 5 months ago
- 🌼 Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition ★15 · Updated last year
- Video2Music: Suitable Music Generation from Videos using an Affective Multimodal Transformer model ★185 · Updated last year
- Awesome lists about Speech Emotion Recognition ★96 · Updated 8 months ago
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ★67 · Updated last month
- SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems ★83 · Updated last year
- Whisper-Flamingo [Interspeech 2024] and mWhisper-Flamingo [IEEE SPL 2025] for Audio-Visual Speech Recognition and Translation ★177 · Updated last month
- [Information Fusion 2024] HiCMAE: Hierarchical Contrastive Masked Autoencoder for Self-Supervised Audio-Visual Emotion Recognition ★111 · Updated this week
- Open TTS models, built for streaming on the edge ★43 · Updated 5 months ago
- [Interspeech 2023] Intelligible Lip-to-Speech Synthesis with Speech Units ★42 · Updated 10 months ago
- This repository contains the code and data for the paper EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control by Haozhe Chen,… ★78 · Updated 11 months ago
- The official implementation of "A Language Modeling Approach to Diacritic-Free Hebrew TTS" ★100 · Updated 2 months ago
- FG 2024 Papers: Explore a comprehensive collection of research papers presented at one of the premier conferences on automatic face and g… ★14 · Updated last year
- TurnGPT: a Transformer-based Language Model for Predicting Turn-taking in Spoken Dialog ★57 · Updated last year
- ★222 · Updated 3 months ago
- A Python library to find differences between audio and transcriptions ★20 · Updated last year
- Open source Python program for automating gain staging. Part 1 of a series for automating audio processing tasks, end goal is to create a… ★42 · Updated last year
- LipSyncr is a lip reading web app based on the LipNet model that can lip read videos. ★63 · Updated 2 years ago
- Efficient approach to speaker diarization using voice characteristics extraction ★99 · Updated 2 months ago
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre… ★22 · Updated 10 months ago
- Coqui AI TTS plugin ★85 · Updated 2 months ago
- ★55 · Updated last week
- SoTA open-source TTS ★81 · Updated 2 months ago
- Speech Emotion Recognition ★43 · Updated 2 years ago
- ★16 · Updated 4 months ago
- ★14 · Updated last year
- Transcribe audio and video files with speaker diarization and logically grouped timestamps using Gemini Flash ★38 · Updated last month
- Implementation of Sesame's Conversational Speech Model for Hugging Face Transformers ★57 · Updated 3 months ago
- Mirror of hf.co/pyannote/speaker-diarization-3.1 ★26 · Updated last year
- text-to-audio-latent-diffusion ★37 · Updated 2 years ago