IBM/MAX-Speech-to-Text-Converter

Converts spoken words into text form.

☆78

Alternatives and similar repositories for MAX-Speech-to-Text-Converter

Users who are interested in MAX-Speech-to-Text-Converter are comparing it to the repositories listed below.

IBM / MAX-Review-Text-Generator
Generate English-language text similar to the text in the Yelp® review data set.
☆18 · Sep 17, 2025 · Updated 10 months ago
IBM / MAX-News-Text-Generator
Generate English-language text similar to the news articles in the One Billion Words data set.
☆26 · Sep 17, 2025 · Updated 10 months ago
CODAIT / pardata
☆17 · Jul 15, 2026 · Updated last week
IBM / MAX-Text-Summarizer
Generate a summarized description of a body of text
☆27 · Sep 17, 2025 · Updated 10 months ago
vadimkantorov / inferspeech
PyTorch speech2text inference script for the NVIDIA openseq2seq wav2letter model variant
☆10 · Aug 12, 2019 · Updated 6 years ago
IBM / MAX-Spatial-Transformer-Network
Train a neural network component that can add spatial transformations such as translation and rotation to larger models.
☆10 · Apr 18, 2019 · Updated 7 years ago
IBM / MAX-Framework
Python package that can be installed to make it easier to create MAX models
☆27 · May 10, 2021 · Updated 5 years ago
IBM / MAX-ResNet-50
Identify objects in images using a first-generation deep residual network.
☆15 · Sep 17, 2025 · Updated 10 months ago
IBM / Train-Custom-Speech-Model
Create a custom Watson Speech to Text model using specialized domain data
☆61 · Aug 31, 2021 · Updated 4 years ago
IBM / MAX-Fast-Neural-Style-Transfer
Generate a new image that mixes the content of a source image with the style of another image.
☆50 · Sep 17, 2025 · Updated 10 months ago
rolczynski / Automatic-Speech-Recognition
🎧 Automatic Speech Recognition: DeepSpeech & Seq2Seq (TensorFlow)
☆222 · Jun 15, 2020 · Updated 6 years ago
zdmc23 / oneshot-audio
Experiment with "one-shot learning" techniques to recognize a voice signature
☆24 · Mar 29, 2020 · Updated 6 years ago
desh2608 / kaldi-noise-vectors
Implementation of different noise embeddings for noise aware training of Kaldi acoustic models.
☆13 · Feb 13, 2021 · Updated 5 years ago
SpeechColab / PySpeechColab
A library of speech gadgets.
☆15 · Oct 15, 2022 · Updated 3 years ago
mayukhnair / deepspeech-colab
Running Mozilla's implementation of Baidu DeepSpeech on Google Colaboratory
☆16 · Mar 18, 2019 · Updated 7 years ago
IBM / MAX-Question-Answering
Answer questions on a given corpus of text.
☆32 · Sep 17, 2025 · Updated 10 months ago
gpu-poor / gramvaani_hindi_asr
This repo contains the baseline model recipes and pre-trained model for the GramVaani Hindi ASR challenge
☆16 · Mar 26, 2022 · Updated 4 years ago
IBM / MAX-Audio-Sample-Generator
Generate short audio clips of speech commands and lo-fi instrumental samples
☆22 · Sep 17, 2025 · Updated 10 months ago
IBM / MAX-Audio-Classifier
Identify sounds in short audio clips
☆158 · Sep 17, 2025 · Updated 10 months ago
roiponytch / Flickering_Adversarial_Video
Code and videos accompanying the paper "Flickering Adversarial Attacks against Video Recognition Networks"
☆16 · Dec 8, 2022 · Updated 3 years ago
Alvearie / quality-measure-and-cohort-service
Service to evaluate quality measure and cohort specifications against a target patient data set.
☆11 · Jun 2, 2022 · Updated 4 years ago
loretoparisi / hf-experiments
Experiments with Hugging Face 🔬 🤗
☆47 · Apr 18, 2026 · Updated 3 months ago
pilot7747 / VoxDIY
This repository provides data and code for the paper "Vox Populi, Vox DIY: Benchmark Dataset for Crowdsourced Audio Transcription".
☆16 · Jul 22, 2021 · Updated 5 years ago
patyork / AutomaticSpeechChunker
From a large speech audio file and its corresponding body of text, automatically chunk the audio and text into (phrase, audio_snippet) pa…
☆17 · May 15, 2015 · Updated 11 years ago
IBM / MAX-Named-Entity-Tagger
Locate and tag named entities in text
☆25 · Sep 17, 2025 · Updated 10 months ago
artbataev / end2end
Losses and decoders for end-to-end ASR and OCR
☆34 · Oct 30, 2020 · Updated 5 years ago
lealeasch / adversarialattacks
Adversarial Attacks
☆21 · Oct 11, 2021 · Updated 4 years ago
yongxuUSTC / DNN-SpeechEnhancement
DNN-based speech enhancement using TensorFlow by Haoyu Li (Tokyo univ.)
☆17 · Aug 31, 2017 · Updated 8 years ago
IBM / MAX-OCR
MAX Optical Character Recognition
☆51 · Sep 17, 2025 · Updated 10 months ago
dense-analysis / vim-speech
Vim Speech Recognition Experiments
☆20 · May 30, 2025 · Updated last year
domcross / german-stt-evaluation
Evaluation of STT models for the German language
☆16 · Jan 22, 2022 · Updated 4 years ago
uhh-lt / kaldi-model-server
Simple Kaldi model server for chain (nnet3) models in online recognition mode directly from a local microphone
☆35 · Feb 18, 2022 · Updated 4 years ago
haitham-chabayta / contactless-respiratory-rate-measurement-app
An application that measures the respiratory rate of patients using a FLIR ONE mobile thermal camera
☆12 · Aug 5, 2020 · Updated 5 years ago
charlesliucn / LanMIT
📖 LanMIT: A Toolkit for Improving Language Models in Low-resourced Speech Recognition based on Kaldi.
☆22 · Jul 12, 2019 · Updated 7 years ago
JarbasAl / kaldi_spotter
Wake word spotting with Kaldi
☆19 · Dec 3, 2020 · Updated 5 years ago
IBM / audioset-classification
Train a Deep Learning model to classify audio embeddings on IBM's Deep Learning as a Service (DLaaS) platform - Watson Machine Learning
☆102 · Sep 17, 2025 · Updated 10 months ago
yc9701 / pansori-tedxkr-corpus
Korean ASR Corpus generated from TEDx talks
☆27 · Jan 11, 2019 · Updated 7 years ago
BUTSpeechFIT / ASR-hybrid-decoding
☆17 · Nov 25, 2019 · Updated 6 years ago
GT4SD / zero-shot-bert-adapters
Implementation of Z-BERT-A: a zero-shot pipeline for unknown intent detection.
☆44 · Jun 13, 2023 · Updated 3 years ago