SamsungLabs / SummaryMixing
This repository implements SummaryMixing, a simpler, faster and much cheaper replacement for self-attention in automatic speech recognition (see: https://arxiv.org/abs/2307.07421). The code is ready to be used with the SpeechBrain toolkit.
☆117 Updated 7 months ago
Alternatives and similar repositories for SummaryMixing:
Users interested in SummaryMixing are comparing it to the libraries listed below.
- ☆56 Updated 2 years ago
- LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT ☆73 Updated 2 years ago
- ConMamba for Automatic Speech Recognition ☆72 Updated 8 months ago
- Example code for a neural transducer model. ☆61 Updated last year
- Official code for Wav2Seq ☆96 Updated 2 years ago
- ☆84 Updated last year
- Speech-MASSIVE is a multilingual Spoken Language Understanding (SLU) dataset comprising the speech counterpart for a portion of the MASSI… ☆21 Updated 8 months ago
- Transcribing Speech with Multinomial Diffusion, training code and models. ☆76 Updated last year
- Implementation of Google's USM speech model in Pytorch ☆31 Updated last month
- The VoxTube dataset official repository ☆68 Updated last year
- ☆35 Updated last week
- Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context ☆194 Updated 7 months ago
- EVAR ~ Evaluation package for Audio Representations ☆53 Updated this week
- ARCH: Audio Representations benCHmark ☆44 Updated 8 months ago
- Implementation of BEST-RQ - a model for self-supervised learning of speech signals using a random projection quantizer, in Pytorch. ☆117 Updated last year
- The People’s Speech Dataset ☆103 Updated last year
- ☆92 Updated this week
- Applications using the GTN library and code to reproduce experiments in "Differentiable Weighted Finite-State Transducers" ☆83 Updated 2 years ago
- A torch implementation of a recursion which turns out to be useful for RNN-T. ☆141 Updated last year
- Prosodic Speech Segmentation with Transformers ☆25 Updated last year
- A list of papers for child ASR ☆40 Updated 6 months ago
- This Repository surveys the paper focusing on Prompting and Adapters for Speech Processing. ☆108 Updated last year
- Standalone implementation of the CUDA-accelerated WFST Decoder available in Riva ☆87 Updated 2 months ago
- Various speech datasets made available to the public ☆116 Updated 4 months ago
- This is the official repository of the papers "Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers" and "Efficient Fi… ☆36 Updated 9 months ago
- Python wrappers for Kaldi Levenshtein's distance and alignment code. ☆64 Updated last year
- A collection of papers related to speech model compression ☆24 Updated last year
- Code for the method proposed in the paper:- ccc-wav2vec 2.0: Clustering aided Cross-Contrastive learning of Self-Supervised speech repres… ☆21 Updated last year
- The implementation for "Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions" ☆39 Updated 3 weeks ago
- Incorporating KenLM language model with HuggingFace implementation of Wav2Vec2CTC Model using beam search decoding ☆75 Updated 3 years ago