Kaldi CNN-TDNNF baseline
☆13Aug 31, 2021Updated 4 years ago
Alternatives and similar repositories for kaldi-baseline
Users that are interested in kaldi-baseline are comparing it to the libraries listed below.
- Auto-KWS 2021 Challenge 1st place solution.☆11Jul 20, 2021Updated 4 years ago
- SChunk-Encoder (Transformer or Conformer) for streaming E2E ASR☆11Oct 21, 2022Updated 3 years ago
- Framework for Detection Evaluation (F4DE) : set of evaluation tools for detection evaluations and for specific NIST-coordinated evaluatio…☆25Jul 6, 2017Updated 8 years ago
- End-to-End Keyword Spotting (E2E-KWS) using a character level LSTM☆44Nov 18, 2022Updated 3 years ago
- E2E system with LF-MMI; word N-gram for Mandarin☆167Apr 29, 2022Updated 4 years ago
- Models and codes for INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model☆13Mar 30, 2025Updated last year
- Target speaker automatic speech recognition (TS-ASR)☆13Oct 14, 2023Updated 2 years ago
- Getting confidences from any end-to-end systems☆11May 24, 2023Updated 2 years ago
- ☆15Aug 25, 2022Updated 3 years ago
- magicspeech competition recipe☆18Jun 29, 2020Updated 5 years ago
- The case study and multilingual performance of ICASSP submission☆24Sep 24, 2022Updated 3 years ago
- 3M: Multi-loss, Multi-path and Multi-level Neural Networks for speech recognition☆118Jun 22, 2022Updated 3 years ago
- Keyword Search Recipe for Subword ASR☆30Jul 12, 2019Updated 6 years ago
- Python implementation of CTC beam search decoder + agnostic LM scorer☆20Dec 16, 2020Updated 5 years ago
- ☆29Jun 15, 2022Updated 3 years ago
- Code for the paper "FastAdaSP: An Efficient Multitask Inference Framework for Large Speech Language Models". @ EMNLP'24(Oral)☆17Nov 14, 2024Updated last year
- ☆15Jul 4, 2024Updated last year
- ☆18Jul 22, 2024Updated last year
- [ASRU 2021] Efficient Conformer: Progressive Downsampling and Grouped Attention for Automatic Speech Recognition☆219Jun 22, 2023Updated 2 years ago
- C++ PyTorch Examples☆10Aug 18, 2019Updated 6 years ago
- Light-weight transfer learning framework for on-device speech and audio recognition using pre-trained image convolutional neural networks…☆18Apr 16, 2022Updated 4 years ago
- MicRank is a Learning to Rank neural channel selection framework where a DNN is trained to rank microphone channels.☆22Apr 8, 2021Updated 5 years ago
- This is a public repository for RATS Channel-A Speech Data, which is a chargeable noisy speech dataset under LDC. Here we release its Log…☆16Oct 22, 2022Updated 3 years ago
- Automatic Speech Recognition (ASR) system for the Samrómur speech corpus using Kaldi☆12Sep 30, 2022Updated 3 years ago
- A wrapper for Audeering's wav2vec-based dimensional speech emotion recognition☆22Aug 9, 2023Updated 2 years ago
- ☆18Mar 18, 2020Updated 6 years ago
- ☆12Jun 5, 2018Updated 7 years ago
- ☆277Jan 15, 2021Updated 5 years ago
- ☆13Mar 25, 2021Updated 5 years ago
- Incorporating KenLM language model with HuggingFace implementation of Wav2Vec2CTC Model using beam search decoding☆75Oct 11, 2021Updated 4 years ago
- MusicYOLO framework uses the object detection model, YOLOx, to locate notes in the spectrogram.☆11Jan 29, 2022Updated 4 years ago
- Korean read speech corpus (about 120 hours, 17GB) from National Institute of Korean Language☆43Feb 28, 2018Updated 8 years ago
- Eureka-Audio: A 1.7B lightweight audio–language model that matches 7B–30B models on ASR, audio understanding, and paralinguistic reasonin…☆40Apr 11, 2026Updated 3 weeks ago
- CN-Celeb, a large-scale Chinese celebrities dataset published by Center for Speech and Language Technology (CSLT) at Tsinghua University.☆79Nov 9, 2019Updated 6 years ago
- Scripts for data generation, scoring and data manifest preparation for CHiME-8 DASR task.☆24Feb 25, 2025Updated last year
- ☆23Jun 24, 2024Updated last year
- ☆11Dec 28, 2023Updated 2 years ago
- Script to perform statistical significance test between ASR hypotheses.☆23Aug 13, 2017Updated 8 years ago
- This repository contains the code for our upcoming paper An Investigation of End-to-End Models for Robust Speech Recognition at ICASSP 20…☆49Dec 25, 2024Updated last year