Converts spoken words into text form.
☆78 · Sep 17, 2025 · Updated 9 months ago
Alternatives and similar repositories for MAX-Speech-to-Text-Converter
Users who are interested in MAX-Speech-to-Text-Converter are comparing it to the libraries listed below.
- Generate English-language text similar to the text in the Yelp® review data set. ☆18 · Sep 17, 2025 · Updated 9 months ago
- Generate English-language text similar to the news articles in the One Billion Words data set. ☆26 · Sep 17, 2025 · Updated 9 months ago
- PyTorch speech2text inference script for the NVidia openseq2seq wav2letter model variant ☆10 · Aug 12, 2019 · Updated 6 years ago
- Train a neural network component that can add spatial transformations such as translation and rotation to larger models. ☆10 · Apr 18, 2019 · Updated 7 years ago
- Identify objects in an image, additionally assigning each pixel of the image to a particular object ☆31 · Sep 17, 2025 · Updated 9 months ago
- Protect communications with adversarial neural cryptography. ☆11 · Oct 31, 2018 · Updated 7 years ago
- Create a custom Watson Speech to Text model using specialized domain data ☆60 · Aug 31, 2021 · Updated 4 years ago
- Generate a new image that mixes the content of a source image with the style of another image. ☆50 · Sep 17, 2025 · Updated 9 months ago
- Generate personalized recommendations ☆14 · Sep 17, 2025 · Updated 9 months ago
- Image classifier for physical places/locations, based on the Places365-CNN Model ☆42 · Sep 17, 2025 · Updated 9 months ago
- 🎧 Automatic Speech Recognition: DeepSpeech & Seq2Seq (TensorFlow) ☆222 · Jun 15, 2020 · Updated 6 years ago
- Experiment with "one-shot learning" techniques to recognize a voice signature ☆24 · Mar 29, 2020 · Updated 6 years ago
- Implementation of different noise embeddings for noise aware training of Kaldi acoustic models. ☆13 · Feb 13, 2021 · Updated 5 years ago
- Running Mozilla's implementation of Baidu DeepSpeech on Google Colaboratory ☆16 · Mar 18, 2019 · Updated 7 years ago
- Categorize sports videos according to which sport the video depicts. ☆24 · Sep 17, 2025 · Updated 9 months ago
- This repo contains the baseline model recipes and pre-trained model for GramVanni Hindi ASR challenge ☆16 · Mar 26, 2022 · Updated 4 years ago
- IBM Code Model Asset Exchange: Show and Tell Image Caption Generator ☆82 · Sep 17, 2025 · Updated 9 months ago
- Evaluation of STT models for the German language ☆15 · Jan 22, 2022 · Updated 4 years ago
- A simple version of the MAX Object Detector Web App rewritten in Python for use in the MAX tutorial ☆10 · Mar 31, 2021 · Updated 5 years ago
- ☆12 · Jun 1, 2026 · Updated 2 weeks ago
- Identify sounds in short audio clips ☆158 · Sep 17, 2025 · Updated 9 months ago
- Hacks helping with semi-almost-usable declarative NixOS sandboxing ☆12 · Aug 14, 2024 · Updated last year
- Service to evaluate quality measure and cohort specifications against a target patient data set. ☆11 · Jun 2, 2022 · Updated 4 years ago
- Experiments with Hugging Face 🔬 🤗 ☆47 · Apr 18, 2026 · Updated 2 months ago
- This repository provides data and code for "Vox Populi, Vox DIY: Benchmark Dataset for Crowdsourced Audio Transcription" paper. ☆16 · Jul 22, 2021 · Updated 4 years ago
- From a large speech audio file and its corresponding body of text, automatically chunk the audio and text into (phrase, audio_snippet) pa… ☆17 · May 15, 2015 · Updated 11 years ago
- Losses and decoders for end-to-end ASR and OCR ☆34 · Oct 30, 2020 · Updated 5 years ago
- Generate embedding vectors from audio files ☆59 · Sep 17, 2025 · Updated 9 months ago
- Adversarial Attacks ☆21 · Oct 11, 2021 · Updated 4 years ago
- Adapt Kaldi-ASR nnet3 chain models from Zamia-Speech.org to a different language model ☆33 · Jan 26, 2020 · Updated 6 years ago
- Detect emotion from audio ☆14 · Nov 20, 2018 · Updated 7 years ago
- Vim Speech Recognition Experiments ☆20 · May 30, 2025 · Updated last year
- Simple Kaldi model server for chain (nnet3) models in online recognition mode directly from a local microphone ☆35 · Feb 18, 2022 · Updated 4 years ago
- ☆12 · Jun 2, 2019 · Updated 7 years ago
- Lisp-like functional programming language written in Go ☆11 · Jun 21, 2020 · Updated 5 years ago
- 📖 LanMIT: A Toolkit for Improving Language Models in Low-resourced Speech Recognition based on Kaldi. ☆22 · Jul 12, 2019 · Updated 6 years ago
- An ongoing 24/7 slow-TV art project ☆13 · Updated this week
- Adds color to black and white images. ☆26 · Sep 17, 2025 · Updated 9 months ago
- Extensions of Leanback Support Library for Android TV. ☆25 · Feb 23, 2018 · Updated 8 years ago