ChunkFormer: Masked Chunking Conformer For Long-Form Speech Transcription
☆77Feb 13, 2026Updated 2 weeks ago
Alternatives and similar repositories for chunkformer
Users who are interested in chunkformer are comparing it to the repositories listed below
- ☆13Updated this week
- ☆11Nov 7, 2024Updated last year
- Grapheme to phoneme converter for Estonian☆14May 27, 2021Updated 4 years ago
- [ICASSP 2025] AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder☆12Mar 11, 2025Updated 11 months ago
- Official code for SongEcho☆41Feb 21, 2026Updated last week
- official implementation of paper ExPO: Explainable Phonetic Trait-Oriented Network for Speaker Verification☆14Mar 14, 2025Updated 11 months ago
- Hybrid-Anchor Rotation Detector for Oriented Object Detection (ICCV'25-SEA)☆16Aug 11, 2025Updated 6 months ago
- ☆26Jan 23, 2026Updated last month
- Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition.☆15May 16, 2025Updated 9 months ago
- ☆30Jan 22, 2026Updated last month
- ☆14Jun 12, 2015Updated 10 years ago
- Onnx compatible styletts2 code☆17Jun 8, 2025Updated 8 months ago
- ☆16Nov 9, 2023Updated 2 years ago
- Python Wrapper of Silero VAD☆64May 8, 2025Updated 9 months ago
- ☆139Apr 23, 2025Updated 10 months ago
- Solving Inverse Problems with Diffusion Optimal Control [NeurIPS 2024]☆19Dec 21, 2024Updated last year
- ViStreamASR - Real-Time Vietnamese Speech Recognition☆52Jul 12, 2025Updated 7 months ago
- Python package of MP-SENet from Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement.☆21Nov 1, 2024Updated last year
- Digital Audio Effects in Python (material for MUSI6202@Georgiatech)☆15Nov 30, 2014Updated 11 years ago
- Code for ICLR 2024 Paper: CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models☆22Jul 10, 2024Updated last year
- C++ Implementation of the Information Bottleneck System☆22Jan 9, 2019Updated 7 years ago
- ☆22Jun 30, 2021Updated 4 years ago
- Audio-JEPA is an adaptation of the Joint-Embedding Predictive Architecture (JEPA) for self-supervised audio representation learning. Buil…☆40Jun 17, 2025Updated 8 months ago
- Official codes of the 1st place for The NVIDIA AI City Challenge 2023 - Track 2☆19Jul 25, 2023Updated 2 years ago
- Prosody and Pronunciation Modification Network☆63May 5, 2025Updated 9 months ago
- Python forced alignment☆95Apr 12, 2024Updated last year
- ☆21Dec 23, 2017Updated 8 years ago
- Use python to login and crawl facebook☆22Nov 24, 2023Updated 2 years ago
- Hybrid Flow Matching and GAN with Multi-Resolution Network for Few-Step High-Fidelity Audio Generation☆135Jan 21, 2026Updated last month
- The python implementation for paper "Towards Discriminative Representation Learning for Speech Emotion Recognition" in IJCAI-2019☆23Aug 12, 2019Updated 6 years ago
- Software Engineering Back End Microservices Project☆15Nov 20, 2024Updated last year
- ⚡ Blazing fast audio augmentation in Python, powered by GPU for high-efficiency processing in machine learning and audio analysis tasks.☆35Jan 19, 2024Updated 2 years ago
- Export an ONNX graph that performs ISTFT. Designed for TTS models.☆27Apr 23, 2024Updated last year
- FLM-Audio is an audio-language subversion of RoboEgo/FLM-Ego -- an omnimodal model with native full duplexity.☆62Dec 9, 2025Updated 2 months ago
- Grapheme-to-Phoneme conversion with Joint-Sequence RnnLMs☆31Dec 15, 2014Updated 11 years ago
- ☆27Oct 25, 2024Updated last year
- Python - NSW package for Vietnamese: Normalization system to convert numbers, abbreviations, and words that cannot be pronounced into syl…☆66Jan 1, 2025Updated last year
- A toolset for easy formant extraction and visualization from wav files and TTS models☆33Sep 2, 2022Updated 3 years ago
- (WIP) long-form speech generation☆31Apr 2, 2025Updated 11 months ago