sdhilip200/speech-to-text



Speech to Text with Hugging Face and Wav2vec 2.0

☆35

Alternatives and similar repositories for speech-to-text

Users interested in speech-to-text are comparing it to the libraries listed below.


nelson-liu / website
☆13 · Feb 5, 2022 · Updated 4 years ago
zhanglei1949 / federatedSpeechCommands
Speech recognition with federated learning
☆11 · Jan 9, 2020 · Updated 6 years ago
So-Cool / xCave
Google Earth Pro image extractor and alignment
☆13 · Feb 9, 2018 · Updated 8 years ago
logikon-ai / cot-eval
A framework for evaluating the effectiveness of chain-of-thought reasoning in language models.
☆19 · Feb 6, 2025 · Updated last year
nttrd-mdlab / wearable-seld-dataset
☆10 · Feb 18, 2022 · Updated 4 years ago
EduardoGarrido90 / ML_books
This repository will contain links to the most famous available books of ML that are online
☆13 · Oct 15, 2024 · Updated last year
GeWanying / shap-anti-spoofing
This repository includes the code to reproduce our paper [Explainable deepfake and spoofing detection: an attack analysis using SHapley A…
☆12 · Jan 24, 2024 · Updated 2 years ago
wali-ku / BWLOCK-GPU
Protecting Real-Time GPU Kernels on Integrated CPU-GPU SoC Platforms
☆12 · Apr 9, 2018 · Updated 8 years ago
thomeou / SALSA-Lite
This is the public repository for SALSA-Lite features for polyphonic sound event localization and detection using microphone arrays.
☆15 · Dec 3, 2021 · Updated 4 years ago
archiki / ASR-Accent-Analysis
Analyzing and investigating the confounding effect of accents in end-to-end Automatic Speech Recognition models.
☆15 · Jun 27, 2020 · Updated 6 years ago
BYRTIMO / END-TO-END-SPEECH-ENHANCEMENT-BASED-ON-DISCRETE-COSINE-TRANSFORM
☆18 · Nov 10, 2019 · Updated 6 years ago
rusiaaman / PCPM
Presenting Collection of Pretrained Models. Links to pretrained models in NLP and voice.
☆23 · Dec 27, 2019 · Updated 6 years ago
muhdhuz / Audio_NeuralStyle
An implementation of Neural Style Transfer for Audio using PyTorch.
☆11 · Dec 14, 2017 · Updated 8 years ago
timelfrink / flask-api
In this repo I show how to simply create an API for your machine learning models in Python
☆12 · Nov 28, 2018 · Updated 7 years ago
Open-Speech-EkStep / indic-punct
☆45 · Dec 15, 2022 · Updated 3 years ago
ina-foss / InaGVAD
Voice activity detection and speaker gender segmentation audiovisual corpus
☆16 · Jan 20, 2025 · Updated last year
david-gimeno / tailored-avsr
Official source code for the paper "Tailored Design of Audio-Visual Speech Recognition Models using Branchformers"
☆14 · Feb 24, 2025 · Updated last year
autowarefoundation / autoware_ai_documentation
☆23 · Aug 31, 2022 · Updated 3 years ago
SSTDV-Project / HF-GAN
☆11 · Jan 12, 2026 · Updated 6 months ago
JHU-LCAP / FlexSED
Open-vocabulary sound event detection
☆53 · Dec 17, 2025 · Updated 7 months ago
Sreyan88 / RECAP
Code for ICASSP 2024 Paper: RECAP: Retrieval-Augmented Audio Captioning
☆16 · Jun 23, 2024 · Updated 2 years ago
RicherMans / SpokenLanguageClassifiers
Pretrained spoken language classifiers from audio.
☆10 · Jan 21, 2021 · Updated 5 years ago
tommyscodebase / gemini_chatbot_javascript
A JavaScript chatbot built with Gemini AI
☆10 · Jan 26, 2024 · Updated 2 years ago
GoFigure-LANL / VisHash
Visual Hash for matching copies of visually similar images.
☆16 · Mar 17, 2025 · Updated last year
aromanusc / SoundQ
Enhanced sound event localization and detection in real 360-degree audio-visual soundscapes (DCASE task3 format)
☆14 · Mar 21, 2025 · Updated last year
Jack-H-Buckner / UniversalDiffEq.jl
Universal differential equations for ecologists
☆16 · Apr 24, 2026 · Updated 2 months ago
zekarias-tilahun / graph-surgeon
A PyTorch implementation of "Self-Supervised GNN that Jointly Learns to Augment" or "Jointly Learnable Data Augmentations for Self-Superv…
☆13 · Dec 13, 2021 · Updated 4 years ago
AccentDB / code
Code for AccentDB.
☆24 · May 28, 2021 · Updated 5 years ago
Vaibhavs10 / dcase-2023-workshop
☆14 · Sep 20, 2023 · Updated 2 years ago
GATECH-EIC / S3-Router
[NeurIPS 2022] "Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Spee…
☆17 · Sep 19, 2023 · Updated 2 years ago
clam004 / unsupervised-speech-representation-learning
This is an intuitive explanation of Representation Learning with Contrastive Predictive Coding using code provided by jefflai108 that use…
☆10 · Jan 25, 2021 · Updated 5 years ago
bond005 / vad
Various algorithms for voice activity detection
☆22 · Jan 31, 2017 · Updated 9 years ago
Yifei-ZHAO96 / Tr-VAD
Tr-VAD: An Efficient Transformer based Voice Activity Detection Model
☆18 · Aug 1, 2024 · Updated last year
pc2752 / Multi_Voice_Sing_Speak_Sing
☆24 · Mar 24, 2023 · Updated 3 years ago
DemisEom / RNNT-pytorch
Implementation of an RNN transducer
☆23 · Jun 25, 2019 · Updated 7 years ago
Arksyd96 / synthesis-with-slice-based-ldm
Official repository for "3D MRI Synthesis with Slice-Based Latent Diffusion Models: Improving Tumor Segmentation Tasks in Data-Scarce Reg…
☆16 · Jun 14, 2024 · Updated 2 years ago
anitaokoh / Medium_Summarizer
Using extractive summarization to summarize Medium posts
☆11 · Nov 17, 2019 · Updated 6 years ago
conradj / pocket-public-archive
Statically generated weekly digest of articles read in Pocket
☆10 · May 14, 2019 · Updated 7 years ago
phpstorm1 / SE-FCN
An implementation of the ICASSP paper 'A fully convolutional neural network for complex spectrogram processing in speech enhancement'
☆20 · Feb 19, 2020 · Updated 6 years ago