IBM/MAX-Audio-Embedding-Generator

Generate embedding vectors from audio files

☆59

Alternatives and similar repositories for MAX-Audio-Embedding-Generator

Users who are interested in MAX-Audio-Embedding-Generator are comparing it to the libraries listed below.

vadimkantorov / inferspeech
PyTorch speech2text inference script for the NVIDIA openseq2seq wav2letter model variant
☆10 · Aug 12, 2019 · Updated 6 years ago
f1recracker / pytorch-deeplab-v3-plus
PyTorch implementation of DeepLab V3+
☆13 · Apr 13, 2019 · Updated 7 years ago
IBM / MAX-Framework
Python package that can be installed to make it easier to create MAX models
☆27 · May 10, 2021 · Updated 5 years ago
domcross / german-stt-evaluation
Evaluation of STT models for the German language
☆16 · Jan 22, 2022 · Updated 4 years ago
cobanov / audio-embedding
Extract audio embeddings from an audio file using Python
☆13 · Jul 25, 2023 · Updated 2 years ago
IBM / MAX-Recommender
Generate personalized recommendations
☆14 · Sep 17, 2025 · Updated 10 months ago
IBM / MAX-Audio-Sample-Generator
Generate short audio clips of speech commands and lo-fi instrumental samples
☆22 · Sep 17, 2025 · Updated 10 months ago
IBM / MAX-Question-Answering
Answer questions on a given corpus of text.
☆32 · Sep 17, 2025 · Updated 10 months ago
IBM / MAX-Breast-Cancer-Mitosis-Detector
Detect whether a mitosis exists in an image of breast cancer tumor cells
☆25 · Sep 17, 2025 · Updated 10 months ago
CODAIT / node-red-contrib-model-asset-exchange
Node-RED nodes for the Model Asset Exchange on IBM Developer
☆20 · May 8, 2020 · Updated 6 years ago
IBM / audioset-classification
Train a Deep Learning model to classify audio embeddings on IBM's Deep Learning as a Service (DLaaS) platform - Watson Machine Learning
☆102 · Sep 17, 2025 · Updated 10 months ago
desh2608 / kaldi-noise-vectors
Implementation of different noise embeddings for noise aware training of Kaldi acoustic models.
☆13 · Feb 13, 2021 · Updated 5 years ago
sainathadapa / mediaeval-2019-moodtheme-detection
4th-place solution to the MediaEval 2019 task on Emotion and Themes in Music using Jamendo
☆15 · Nov 13, 2019 · Updated 6 years ago
siscale / covid-19-elk
☆10 · Apr 22, 2020 · Updated 6 years ago
TehreemFarooqi / Preparing-a-speech-recognition-dataset-using-YouTube-videos
Using YouTube to prepare a speech recognition dataset for any language
☆10 · Mar 30, 2021 · Updated 5 years ago
IBM / MAX-Audio-Classifier
Identify sounds in short audio clips
☆158 · Sep 17, 2025 · Updated 10 months ago
stas6626 / IDRnd
ID R&D Voice Antispoofing Challenge Solution
☆11 · Jul 27, 2019 · Updated 6 years ago
diff7 / tts-king
A repository for trainable multi-speaker TTS
☆14 · Nov 28, 2021 · Updated 4 years ago
gbegus / DeepPhonologyTool
Train a fiwGAN or ciwGAN model using your own training data
☆14 · Oct 13, 2022 · Updated 3 years ago
microbs-io / microbs
Microservices observability
☆17 · Jul 4, 2024 · Updated 2 years ago
gpu-poor / gramvaani_hindi_asr
This repo contains the baseline model recipes and pre-trained model for the GramVaani Hindi ASR challenge
☆16 · Mar 26, 2022 · Updated 4 years ago
cnlinxi / speech_emotion
Detect emotion from audio
☆14 · Nov 20, 2018 · Updated 7 years ago
Yangyangii / TPGST-Tacotron
Google's TPGST reimplementation.
☆34 · Dec 11, 2019 · Updated 6 years ago
IBM / max-tutorial-app-python
A simple version of the MAX Object Detector Web App rewritten in Python for use in the MAX tutorial
☆10 · Mar 31, 2021 · Updated 5 years ago
snormanhaignere / natsound165-neuron2015
Code and data documenting this paper: "Distinct Cortical Pathways for Music and Speech Revealed by Hypothesis-Free Voxel Decomposition". …
☆12 · Jul 18, 2022 · Updated 4 years ago
interactiveaudiolab / voogle
This is code for an audio search engine that uses vocal imitations of the desired sound
☆38 · May 16, 2023 · Updated 3 years ago
mikex86 / DeepSpeech-Java-Bindings
Java Bindings for the C++ library DeepSpeech
☆10 · Jun 4, 2020 · Updated 6 years ago
Idlak / Living-Audio-Dataset
A "Crowd-Built" continuously growing speech dataset with transcripts. The dataset contains multiple languages and is intended for anyone …
☆43 · Aug 3, 2022 · Updated 3 years ago
IBM / MAX-Human-Pose-Estimator
Detect humans in an image and estimate the pose for each person
☆64 · Sep 17, 2025 · Updated 10 months ago
StuartMellor / Max-MSP-RNBO-CPP-Native-Android
An initial foray into deploying an exported RNBO Max MSP object as an Android app.
☆14 · Jan 25, 2023 · Updated 3 years ago
sarahjuan / iban
☆14 · Jun 12, 2015 · Updated 11 years ago
IBM / MAX-Sports-Video-Classifier
Categorize sports videos according to which sport the video depicts.
☆24 · Sep 17, 2025 · Updated 10 months ago
speechpro / cloud-python
Python client for the speech recognition and synthesis API of the Speech Technology Center (ЦРТ) cloud
☆11 · Dec 26, 2022 · Updated 3 years ago
igormq / ctcdecode-pytorch
Python implementation of CTC beam search decoder + agnostic LM scorer
☆20 · Dec 16, 2020 · Updated 5 years ago
multitel-ai / urban-sound-tagging
1st place solution to the DCASE 2020 - Task 5 - Urban Sound Tagging with Spatiotemporal Context
☆17 · Dec 8, 2022 · Updated 3 years ago
lunixbochs / feeds
Transcribe audio feeds into a public web UI
☆45 · Aug 31, 2022 · Updated 3 years ago
hanshounsu / d3rm
☆14 · Feb 3, 2026 · Updated 5 months ago
burrmill / burrmill
BurrMill core
☆22 · Nov 2, 2021 · Updated 4 years ago
idiap / inv-tn
A bunch of scripts exploiting several tools to perform inverse text normalization (ITN)
☆21 · Sep 27, 2017 · Updated 8 years ago