Ichigo Whisper is a compact (22M parameters), open-source speech tokenizer for Whisper-medium, designed to improve multilingual performance with minimal impact on the model's original English capabilities.
☆17 · Jan 20, 2025 · Updated last year
Alternatives and similar repositories for WhisperSpeech
Users that are interested in WhisperSpeech are comparing it to the libraries listed below.
- ☆21 · Mar 25, 2025 · Updated last year
- Code release for "TinySpeech: Attention Condensers for Deep Speech Recognition Neural Networks on Edge Devices" ☆23 · Jun 7, 2025 · Updated last year
- A High-Quality and Large-Scale Dataset for English-Vietnamese Speech Translation (INTERSPEECH 2022) ☆25 · Jun 5, 2025 · Updated last year
- ☆21 · Jun 12, 2025 · Updated last year
- App to search images with Unsplash's API and react-query 🔋 ☆10 · Oct 7, 2022 · Updated 3 years ago
- A lightweight, efficient variation of the StyleTTS 2 text-to-speech model. ☆50 · May 22, 2025 · Updated last year
- Vistral-V: Visual Instruction Tuning for Vistral - Vietnamese Large Vision-Language Model. ☆23 · Jul 1, 2024 · Updated last year
- zero shot NER fine tuning ☆14 · Mar 17, 2025 · Updated last year
- A Deep-learning project utilizing 3D human pose estimation to compare different poses ☆13 · Feb 25, 2024 · Updated 2 years ago
- Instruction Fine-Tuning of Meta Llama 3.2-3B Instruct on Kannada Conversations. Tailoring the model to follow specific instructions in Ka… ☆27 · Jan 30, 2025 · Updated last year
- HiFi-SR is a Python-based pipeline for the detection of plant mitochondrial structural rearrangements based on the mapping of PacBio high… ☆10 · Updated this week
- The OpenAI Whisper speech-to-text model as a simple HTTP server ☆14 · Oct 26, 2023 · Updated 2 years ago
- Bayesian histograms for estimation of binary rare event rates, with fully automated bin pruning ☆29 · Oct 14, 2021 · Updated 4 years ago
- Code and datasets for the salesforce AI research paper on prompt leakage and multi-turn threats against LLMs ☆22 · Jun 2, 2026 · Updated 2 weeks ago
- ☆16 · Oct 13, 2024 · Updated last year
- Summary of all repositories for my public contents, mostly Python, in Jupyter Notebooks, PDFs, Markdowns, and more! ☆11 · Aug 24, 2021 · Updated 4 years ago
- ☆12 · Nov 1, 2023 · Updated 2 years ago
- Large Language Models (LLMs) Learning Resources ☆21 · Jun 16, 2024 · Updated 2 years ago
- Trends, Tools, News timeline ... ☆20 · Oct 13, 2025 · Updated 8 months ago
- Telegram bot to help you with your findings 🚀 ☆23 · Jul 11, 2024 · Updated last year
- A public repo that contains integrations for Argilla and LlamaIndex. ☆17 · Oct 10, 2024 · Updated last year
- LIGHTVOC AN UPSAMPLING-FREE GAN VOCODER BASED ON CONFORMER AND INVERSE SHORT-TIME FOURIER TRANSFORM ☆18 · May 17, 2024 · Updated 2 years ago
- [APSIPA'22] Exploring Speaker Age Estimation on Different Self-Supervised Learning Models ☆14 · Oct 19, 2022 · Updated 3 years ago
- The demo page for ALMTokenizer ☆59 · Apr 14, 2025 · Updated last year
- End-to-end Multi-task Solutions for Aspect Category Sentiment Analysis (ACSA) on Vietnamese reviews, using PhoBERT as pretrained model ☆34 · Jul 9, 2024 · Updated last year
- openvino version of openai/whisper ☆15 · Oct 8, 2024 · Updated last year
- Visualization tools for audio-only and multi-modal speaker diarization dataset ☆13 · Oct 27, 2023 · Updated 2 years ago
- ☆14 · Aug 19, 2024 · Updated last year
- This repo is for measuring the heart rate and respiration rate using the webcam. Working on SpO2 oxygen level and will try for blood pres… ☆14 · Jul 12, 2022 · Updated 3 years ago
- Implementation of 🥥 Coconut, Chain of Continuous Thought, in Pytorch ☆183 · Jun 20, 2025 · Updated 11 months ago
- ☆65 · Nov 24, 2024 · Updated last year
- Low-latency ASR using SpeechBrain StreamingASR and torchaudio StreamReader. ☆18 · Apr 19, 2025 · Updated last year
- Repository for the paper "ViHateT5: Enhancing Hate Speech Detection in Vietnamese with A Unified Text-to-Text Transformer Model" (ACL'202… ☆12 · Aug 13, 2024 · Updated last year
- BookWorm: A Dataset for Character Description and Analysis [EMNLP Findings 2024] ☆14 · Feb 28, 2025 · Updated last year
- ☆58 · Feb 8, 2026 · Updated 4 months ago
- InSales e-commerce platform API bindings ☆14 · Jul 13, 2024 · Updated last year
- Whisper-Flamingo [Interspeech 2024] and mWhisper-Flamingo [IEEE SPL 2025] for Audio-Visual Speech Recognition and Translation ☆209 · Jul 29, 2025 · Updated 10 months ago
- Interaction-Focused Anomaly Detection on Bipartite Node-and-Edge-Attributed Graphs ☆16 · Aug 7, 2023 · Updated 2 years ago
- StyleTTS2 + Vocos as a Decoder ☆13 · Mar 24, 2025 · Updated last year