SamsungLabs / SummaryMixing
This repository implements SummaryMixing, a simpler, faster and much cheaper replacement for self-attention in automatic speech recognition (see: https://arxiv.org/abs/2307.07421). The code is ready to be used with the SpeechBrain toolkit.
☆117 Updated 6 months ago
Alternatives and similar repositories for SummaryMixing:
Users who are interested in SummaryMixing are comparing it to the libraries listed below:
- ☆84 Updated 11 months ago
- LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT ☆69 Updated 2 years ago
- ConMamba for Automatic Speech Recognition ☆62 Updated 7 months ago
- Implementation of BEST-RQ - a model for self-supervised learning of speech signals using a random projection quantizer, in Pytorch. ☆112 Updated last year
- Official code for Wav2Seq ☆96 Updated 2 years ago
- ☆56 Updated 2 years ago
- Example code for a neural transducer model. ☆60 Updated last year
- ☆77 Updated this week
- Transcribing Speech with Multinomial Diffusion, training code and models. ☆76 Updated last year
- DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning ☆46 Updated last year
- NOTSOFAR-1 Challenge: Distant Diarization and ASR ☆50 Updated last month
- A collection of papers related to speech model compression ☆23 Updated last year
- Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context ☆192 Updated 6 months ago
- Code for the paper: GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities ☆114 Updated 3 months ago
- Confidence interval computation for evaluation in machine learning using the bootstrapping approach ☆77 Updated 11 months ago
- Zero-shot Domain-sensitive Speech Recognition with Prompt-conditioning Fine-tuning (ASRU2023) ☆27 Updated last year
- Implementation of the paper "Self-supervised Learning with Random-projection Quantizer for Speech Recognition" in Pytorch. ☆71 Updated last year
- Training code and trained checkpoints for ASGAN. ☆62 Updated last year
- The VoxTube dataset official repository ☆68 Updated last year
- Clustering-based methods for overlapping diarization ☆77 Updated last year
- Datasets for turn-taking research ☆12 Updated last year
- MeetEval - A meeting transcription evaluation toolkit ☆89 Updated 2 weeks ago
- (Interspeech 2023 & ICASSP 2024) Official repository for ARMHuBERT and STaRHuBERT ☆39 Updated 6 months ago
- Feature extractor for DL speech processing. ☆65 Updated 2 years ago
- Standalone implementation of the CUDA-accelerated WFST Decoder available in Riva ☆83 Updated last month
- Reference-aware automatic speech evaluation toolkit ☆144 Updated 3 months ago
- A toolkit to calculate speech audio quality. Not affiliated with the original authors ☆50 Updated 7 months ago
- Speaker identification/verification models for Machine Learning for Computer Vision class at UNIBO ☆62 Updated 2 years ago
- Pypi installable TDNN and TDNN-F layers for PyTorch based acoustic model training ☆39 Updated 4 years ago
- The People’s Speech Dataset ☆102 Updated last year