khanld/chunkformer

ChunkFormer: Masked Chunking Conformer For Long-Form Speech Transcription

☆82

Alternatives and similar repositories for chunkformer

Users who are interested in chunkformer are comparing it to the libraries listed below.

nguyenvulebinh / ViStreamASR
ViStreamASR - Real-Time Vietnamese Speech Recognition
☆61 · Jul 12, 2025 · Updated last year
ducnt18121997 / Viet-Text-Normalization
A Python library for text normalization, specifically designed for Vietnamese and English text processing. This library provides comprehe…
☆14 · Mar 30, 2025 · Updated last year
nguyenthienhy / F5-TTS-Vietnamese
☆161 · Apr 23, 2025 · Updated last year
nguyentrungnghia1998 / Reinforcement-Learning-for-Optimal-Feedback-Control-Simulation
☆18 · Mar 3, 2023 · Updated 3 years ago
v-nhandt21 / Vinorm
Python - NSW package for Vietnamese: Normalization system to convert numbers, abbreviations, and words that cannot be pronounced into syl…
☆68 · Jan 1, 2025 · Updated last year
MysticShadow427 / simplistic-zipformer
Simplistic implementation of Zipformer: A faster and better encoder for automatic speech recognition in PyTorch
☆22 · Jun 3, 2024 · Updated 2 years ago
zzasdf / VietASR
☆52 · Sep 3, 2025 · Updated 10 months ago
NhutP / VietSpeech
☆13 · Apr 25, 2025 · Updated last year
jakariaemon / WSI
Whisper Speaker Identification (WSI), a cutting-edge model for multilingual speaker identification.
☆26 · Jun 29, 2026 · Updated 3 weeks ago
Miamoto / Conformer-NTM
☆16 · Nov 9, 2023 · Updated 2 years ago
Fsoft-AIC / Lightweight-Language-driven-Grasp-Detection
[IROS 2024] Lightweight Language-driven Grasp Detection using Conditional Consistency Model
☆31 · Aug 14, 2024 · Updated last year
uthree / ddsp-vocoder
☆12 · Nov 7, 2024 · Updated last year
chuoibo / VocalMind
End to End Speech to Speech with Emotion System
☆15 · Feb 6, 2025 · Updated last year
vnk8071 / ZAIC2022-Lyric-Alignment
Top 9 private leaderboard & Top 17 public leaderboard
☆10 · Dec 1, 2022 · Updated 3 years ago
Wasser1462 / Qwen3-ASR-onnx
A small and simple example showing how to run Qwen3-ASR with ONNX Runtime.
☆33 · Apr 8, 2026 · Updated 3 months ago
ZaloAI-Jaist / VMLU
☆82 · May 4, 2024 · Updated 2 years ago
dangtr0408 / StyleTTS2-lite
A lightweight, efficient variation of the StyleTTS 2 text‐to‐speech model.
☆50 · May 22, 2025 · Updated last year
samsad35 / code-ancogen
[ICASSP 2025] AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder
☆14 · Mar 11, 2025 · Updated last year
tuanh123789 / Spark-TTS-finetune
Fine-tune the LLM part of the Spark-TTS model
☆126 · Mar 25, 2025 · Updated last year
k2-fsa / ZipVoice
Fast and High-Quality Zero-Shot Text-to-Speech with Flow Matching
☆1,018 · Dec 2, 2025 · Updated 7 months ago
patriotyk / styletts2-inference
ONNX-compatible StyleTTS2 code
☆16 · Apr 4, 2026 · Updated 3 months ago
ducnh279 / LLMs-Pretraining-with-PyTorch
Code example for pretraining an LLM with a vanilla PyTorch training loop
☆10 · Jun 6, 2024 · Updated 2 years ago
manhdh32 / 1st_kalapa_ocr
☆11 · Jan 1, 2024 · Updated 2 years ago
YangXusheng-yxs / CodecFormer_5Hz
☆35 · Oct 23, 2025 · Updated 9 months ago
mmmmayi / ExPO
Official implementation of the paper "ExPO: Explainable Phonetic Trait-Oriented Network for Speaker Verification"
☆14 · Mar 14, 2025 · Updated last year
Vietnam-Celeb / Vietnam-Celeb
☆12 · Mar 9, 2023 · Updated 3 years ago
T-Sunm / rag-ops
This project applies the core knowledge from the LLMOps module, including the design and implementation of the API Layer, Inference Layer…
☆74 · Dec 27, 2025 · Updated 6 months ago
nguyenvulebinh / spoken-norm
Transforming spoken text into written text
☆31 · May 14, 2024 · Updated 2 years ago
facebookresearch / FlowDec
A neural full-band audio codec for general audio sampled at 48 kHz with 7.5 kbps or 4.5 kbps.
☆212 · Jun 22, 2026 · Updated last month
viettmab / SA-DPM
☆16 · Jan 28, 2024 · Updated 2 years ago
zef1611 / AIC23_NLRetrieval_HCMIU_CVIP
Official code of the 1st-place solution for the NVIDIA AI City Challenge 2023 - Track 2
☆20 · Jul 25, 2023 · Updated 3 years ago
juhayna-zh / BSRNN-speech-preprocess
A solution for denoising and separating two-speaker mixed noisy speech, using a BSRNN-inspired network.
☆15 · Aug 22, 2023 · Updated 2 years ago
5Hyeons / StyleTTS2-Vocos
StyleTTS2 + Vocos as a Decoder
☆13 · Mar 24, 2025 · Updated last year
longphungtuan94 / Vietnamese-Text-to-Speech
⏩ Generating speech in a single forward pass without any attention!
☆13 · Jul 22, 2021 · Updated 5 years ago
ictnlp / FastLongSpeech
FastLongSpeech is a novel framework designed to extend the capabilities of Large Speech-Language Models for efficient long-speech process…
☆16 · Jul 22, 2025 · Updated last year
PhucNDA / HA-RDet
Hybrid-Anchor Rotation Detector for Oriented Object Detection (ICCV'25)
☆18 · Aug 11, 2025 · Updated 11 months ago
andvg3 / LSDM
Dataset and Code for NeurIPS 2023 paper "Language-driven Scene Synthesis using Multi-conditional Diffusion Model."
☆48 · Aug 8, 2024 · Updated last year
OptimusPrimus / tacos
Temporally-aligned Audio CaptiOnS for Language-Audio Pretraining
☆16 · Oct 12, 2025 · Updated 9 months ago
v-nhandt21 / Viphoneme
Vi_G2P or ViG2P: G2P package for Vietnamese: based on vPhon and phonology knowledge to convert Raw text - Graphoneme to IPA
☆109 · Jun 21, 2024 · Updated 2 years ago