Speech to Text with Hugging Face and Wav2vec 2.0
☆35Feb 13, 2021Updated 5 years ago
Alternatives and similar repositories for speech-to-text
Users that are interested in speech-to-text are comparing it to the libraries listed below.
- Creating super-parallel corpora of more than 1,500 unique languages for NLP research☆34Dec 8, 2022Updated 3 years ago
- This is my Official and newest website portfolio with the source code if you want to try my design☆18Dec 13, 2023Updated 2 years ago
- A framework for evaluating the effectiveness of chain-of-thought reasoning in language models.☆19Feb 6, 2025Updated last year
- a simplified version of wav2vec(1.0, vq, 2.0) in fairseq☆171Sep 21, 2020Updated 5 years ago
- ☆30Dec 30, 2025Updated 3 months ago
- ☆13Feb 5, 2022Updated 4 years ago
- This repository will contain links to the most famous available books of ML that are online☆13Oct 15, 2024Updated last year
- ☆18Nov 10, 2019Updated 6 years ago
- A simple Python script to convert FOA audio to binaural.☆15Nov 29, 2022Updated 3 years ago
- Presenting Collection of Pretrained Models. Links to pretrained models in NLP and voice.☆23Dec 27, 2019Updated 6 years ago
- In this repo I show how to simply create an API for your machine learning models in Python☆12Nov 28, 2018Updated 7 years ago
- An implementation of Neural Style Transfer for Audio using Pytorch.☆11Dec 14, 2017Updated 8 years ago
- Pytorch implementation of "A Deep Reinforced Model for Abstractive Summarization"(https://arxiv.org/abs/1705.04304)☆18Aug 15, 2017Updated 8 years ago
- My system for the DCASE 2022 Task 3 Sound Event Localization and Detection.☆12Nov 12, 2022Updated 3 years ago
- A Javascript Chatbot built with the Gemini AI☆10Jan 26, 2024Updated 2 years ago
- Github repository for inzva-ai project Audio Style Transfer☆56Oct 13, 2018Updated 7 years ago
- Data generator for stereo sound event localization and detection task of DCASE 2025 challenge☆16Jul 17, 2025Updated 9 months ago
- ☆45Dec 15, 2022Updated 3 years ago
- Official source code for the paper "Tailored Design of Audio-Visual Speech Recognition Models using Branchformers"☆14Feb 24, 2025Updated last year
- Streamlit Dashboard over Superstore Data stored in Postgres Docker container. With SQLAlchemy + Plotly Express☆12Oct 16, 2024Updated last year
- Visual Hash for matching copies of visually similar images.☆16Mar 17, 2025Updated last year
- Code for ICASSP 2024 Paper: RECAP: Retrieval-Augmented Audio Captioning☆15Jun 23, 2024Updated last year
- Official Repo for the Paper "AI as Humanity's Salieri: Quantifying Linguistic Creativity of Language Models via Systematic Attribution o…☆26Jan 12, 2025Updated last year
- ☆14Sep 20, 2023Updated 2 years ago
- [NeurIPS 2022] "Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Spee…☆17Sep 19, 2023Updated 2 years ago
- This is an intuitive explanation of Representation Learning with Contrastive Predictive Coding using code provided by jefflai108 that use…☆10Jan 25, 2021Updated 5 years ago
- Tutorial for Brats 2024 BraSyn (Missing Modality Synthesis) Challenge☆20Sep 30, 2024Updated last year
- My personal site. Contains my blog and other useful sections☆14Apr 22, 2026Updated last week
- DUSTED: Spoken-Term Discovery using Discrete Speech Units☆18Oct 2, 2024Updated last year
- Python Turtle - A selection of geometric patterns☆12Apr 13, 2018Updated 8 years ago
- [WACV'18] Where and Who? Automatic Semantic-Aware Person Composition☆14Apr 15, 2022Updated 4 years ago
- Official repository for "3D MRI Synthesis with Slice-Based Latent Diffusion Models: Improving Tumor Segmentation Tasks in Data-Scarce Reg…☆16Jun 14, 2024Updated last year
- Pytorch implementation of MICCAI-2022 paper, Domain-adaptive 3D Medical Image Synthesis: An Efficient Unsupervised Approach https://arxiv…☆22Jul 5, 2022Updated 3 years ago
- Demo project using Astro.js and Filament as a headless CMS☆14Aug 29, 2024Updated last year
- Unveiling the Knowledge of Hindu Scriptures☆13Mar 31, 2025Updated last year
- Welcome to the Real-Time Voice Activity Detection (VAD) program, powered by Silero-VAD model! 🚀 This program allows you to perform live …☆12Jul 9, 2023Updated 2 years ago
- statically generated weekly digest of articles read in Pocket☆10May 14, 2019Updated 6 years ago
- Speech Recognition for speakers with speech disorders due to diseases like Cerebral Palsy, Parkinson or Amyotrophic Lateral Sclerosis ALS…☆23Mar 26, 2017Updated 9 years ago
- Official code for our paper "Reasoning Models Hallucinate More: Factuality-Aware Reinforcement Learning for Large Reasoning Models"☆24Oct 31, 2025Updated 5 months ago