gongouveia/Whisper-Synthetic-ASR-Dataset-Generator

This UI serves as a Synthetic ASR Dataset Generator powered by/for OpenAI Whisper, enabling users to capture audio, transcribe it on the fly, and manage the generated dataset 🤗. Fine-tune Whisper on enhanced and custom datasets.

☆34

Alternatives and similar repositories for Whisper-Synthetic-ASR-Dataset-Generator

Users that are interested in Whisper-Synthetic-ASR-Dataset-Generator are comparing it to the libraries listed below.

bbc / dialogger
Text-based media editing interface
☆16 · Aug 9, 2017 · Updated 8 years ago
winstxnhdw / CapGen
A fast CPU-first video/audio transcriber for generating caption files with Whisper and CTranslate2, hosted on Hugging Face Spaces.
☆11 · Updated this week
doppio / word2num
A Python package for converting numbers expressed in natural language to numerical values.
☆13 · Nov 25, 2023 · Updated 2 years ago
KimberleyJensen / kmdx-net_music-source-separation
☆34 · May 15, 2023 · Updated 3 years ago
iamhectorotero / generative-audio-inpainting
Can Neural Networks reconstruct missing audio data? What about GANs?
☆18 · Nov 6, 2019 · Updated 6 years ago
devbret / detailed-audio-analysis
Analyze and visualize how rhythm, timbre, loudness, pitch, spectral characteristics and other key audio features evolve over time across …
☆11 · Updated this week
TeamAudio / reaspeech-lite
Speech-to-text transcription VST3/ARA plugin
☆62 · Jun 8, 2026 · Updated last month
jerrykrinock / UnixDomainSocketsDemo
Demo of how to use Unix Domain Sockets in Swift on macOS.
☆13 · Sep 24, 2021 · Updated 4 years ago
pamparamm / ComfyUI-vectorscope-cc
ComfyUI port of SDWebUI Vectorscope CC and Diffusion CG extensions
☆21 · Feb 24, 2025 · Updated last year
nkchocoai / ComfyUI-TextOnSegs
Custom node for ComfyUI. Add a node for drawing text to the area of SEGS.
☆14 · Mar 30, 2025 · Updated last year
ina-foss / InaGVAD
Voice activity detection and speaker gender segmentation audiovisual corpus
☆16 · Jan 20, 2025 · Updated last year
Vrushank264 / Human-Parsing-PyTorch
Human body part segmentation model, trained with 22 class labels.
☆17 · Sep 28, 2023 · Updated 2 years ago
ForeverPs / AODNet-Based-Image-Haze-Removal
Single Image Haze Removal Using AODNet in Pytorch
☆15 · Mar 5, 2021 · Updated 5 years ago
DIVISIO-AI / whisper-java
A Java port of whisper 3, based on the huggingface version, using DJL.
☆17 · Apr 3, 2024 · Updated 2 years ago
Hiroshiba / openjtalk-label-getter
☆10 · Dec 10, 2021 · Updated 4 years ago
xavierfav / feature-comparison-clustering
Comparing Audio Features for Unsupervised Sound Classification
☆10 · Jun 22, 2022 · Updated 4 years ago
taeyoun811 / Whisfusion
Whisfusion: Parallel ASR Decoding via a Diffusion Transformer
☆31 · Aug 22, 2025 · Updated 11 months ago
egorsmkv / whisper-ukrainian
Trainer and Evaluation scripts for fine-tuning Whisper models for the Ukrainian language
☆23 · Jan 13, 2023 · Updated 3 years ago
manymuch / Natural-Noise-Generator
☆10 · Aug 3, 2019 · Updated 6 years ago
rhasspy / wav2mel
Transform audio files into mel spectrograms for text-to-speech model training
☆12 · Aug 25, 2021 · Updated 4 years ago
GPUPhobia / vocal-mask
☆12 · May 1, 2019 · Updated 7 years ago
david-gimeno / tailored-avsr
Official source code for the paper "Tailored Design of Audio-Visual Speech Recognition Models using Branchformers"
☆15 · Feb 24, 2025 · Updated last year
muhdhuz / audio2spec
Scripts to convert audio files to spectrograms and back
☆12 · Nov 23, 2017 · Updated 8 years ago
devanshbatham / getsan
A utility to fetch and display dns names from the SSL/TLS cert data
☆15 · Aug 11, 2023 · Updated 2 years ago
zirui-ray-liu / DivAug
☆13 · Aug 25, 2021 · Updated 4 years ago
zanvari / resnet50-quantization
Resnet50 Quantization for Inference Speedup in PyTorch
☆23 · Jan 30, 2021 · Updated 5 years ago
dynilib / dynitag
Collaborative audio annotation tool
☆17 · Sep 16, 2022 · Updated 3 years ago
unreal79 / pic2wav
Encode an image to sound (WAV file) and view it as a spectrogram. Optimized Python 3 version.
☆11 · Jan 25, 2023 · Updated 3 years ago
d3n7 / riffusionPrepper
Prepare spectrograms from audio for training a Riffusion model
☆16 · Mar 6, 2023 · Updated 3 years ago
tarepan / rainbowgram
Rainbowgram with Python
☆13 · Jan 28, 2019 · Updated 7 years ago
sagiebenaim / Singing
☆19 · May 9, 2019 · Updated 7 years ago
pje / sid
Arduino/AVR C code for controlling the MOS6581 SID sound chip over MIDI
☆11 · Oct 14, 2024 · Updated last year
hylarucoder / comfyui-copilot
☆28 · Jun 28, 2024 · Updated 2 years ago
adrianbarahona / conditional_wavegan_knocking_sounds
Keras implementation of conditional waveGAN. Application to knocking sound effects with emotion.
☆10 · Jun 22, 2020 · Updated 6 years ago
ryoasu / grad-cam
Grad-CAM (Gradient-weighted Class Activation Mapping)
☆13 · Dec 20, 2019 · Updated 6 years ago
shareef12 / img2wav
Convert images to audio for display in a spectrogram
☆13 · Apr 17, 2018 · Updated 8 years ago
marph91 / pocket-cnn
CNN-to-FPGA-framework for small CNN, written in VHDL and Python
☆24 · Jun 8, 2021 · Updated 5 years ago
lionelmessi6410 / tensorflow2-cifar
95.76% on CIFAR-10 with TensorFlow2
☆32 · Oct 21, 2021 · Updated 4 years ago
electronoora / joyemu
RPi program to use Bluetooth and/or USB gamepads and mice on retro 8/16-bit computers (C64, Amiga, etc)
☆15 · Dec 11, 2020 · Updated 5 years ago