SamsungLabs / SummaryMixing

This repository implements SummaryMixing, a simpler, faster, and much cheaper replacement for self-attention for automatic speech recognition (see: https://arxiv.org/abs/2307.07421). The code is ready to be used with the SpeechBrain toolkit.

☆117

Alternatives and similar repositories for SummaryMixing:

Users interested in SummaryMixing are comparing it to the repositories listed below:

besacier / ASR2022
☆56 Updated 2 years ago
mechanicalsea / lighthubert
LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT
☆73 Updated 2 years ago
xi-j / Mamba-ASR
ConMamba for Automatic Speech Recognition
☆72 Updated 8 months ago
lorenlugosch / transducer-tutorial
Example code for a neural transducer model.
☆61 Updated last year
asappresearch / wav2seq
Official code for Wav2Seq
☆96 Updated 2 years ago
apple / pytorch-speech-features
☆84 Updated last year
hlt-mt / Speech-MASSIVE
Speech-MASSIVE is a multilingual Spoken Language Understanding (SLU) dataset comprising the speech counterpart for a portion of the MASSI…
☆21 Updated 8 months ago
RF5 / transfusion-asr
Transcribing Speech with Multinomial Diffusion, training code and models.
☆76 Updated last year
kyegomez / USM
Implementation of Google's USM speech model in Pytorch
☆31 Updated last month
IDRnD / VoxTube
The VoxTube dataset official repository
☆68 Updated last year
RuABraun / texterrors
☆35 Updated last week
k2-fsa / libriheavy
Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context
☆194 Updated 7 months ago
nttcslab / eval-audio-repr
EVAR ~ Evaluation package for Audio Representations
☆53 Updated this week
MorenoLaQuatra / ARCH
ARCH: Audio Representations benCHmark
☆44 Updated 8 months ago
lucasnewman / best-rq-pytorch
Implementation of BEST-RQ - a model for self-supervised learning of speech signals using a random projection quantizer, in Pytorch.
☆117 Updated last year
mlcommons / peoples-speech
The People’s Speech Dataset
☆103 Updated last year
huggingface / open_asr_leaderboard
☆92 Updated this week
facebookresearch / gtn_applications
Applications using the GTN library and code to reproduce experiments in "Differentiable Weighted Finite-State Transducers"
☆83 Updated 2 years ago
k2-fsa / fast_rnnt
A torch implementation of a recursion which turns out to be useful for RNN-T.
☆141 Updated last year
Nathan-Roll1 / PSST
Prosodic Speech Segmentation with Transformers
☆25 Updated last year
Diamondfan / Child-ASR-Paper
A list of papers for child ASR
☆40 Updated 6 months ago
ga642381 / Speech-Prompts-Adapters
This Repository surveys the paper focusing on Prompting and Adapters for Speech Processing.
☆108 Updated last year
nvidia-riva / riva-asrlib-decoder
Standalone implementation of the CUDA-accelerated WFST Decoder available in Riva
☆87 Updated 2 months ago
revdotcom / speech-datasets
Various speech datasets made available to the public
☆116 Updated 4 months ago
umbertocappellazzo / PETL_AST
This is the official repository of the papers "Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers" and "Efficient Fi…
☆36 Updated 9 months ago
pzelasko / kaldialign
Python wrappers for Kaldi Levenshtein's distance and alignment code.
☆64 Updated last year
pyf98 / speech-model-compression
A collection of papers related to speech model compression
☆24 Updated last year
Speech-Lab-IITM / CCC-wav2vec-2.0
Code for the method proposed in the paper:- ccc-wav2vec 2.0: Clustering aided Cross-Contrastive learning of Self-Supervised speech repres…
☆21 Updated last year
cuhealthybrains / MT-LLM
The implementation for "Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions"
☆39 Updated 3 weeks ago
farisalasmary / wav2vec2-kenlm
Incorporating KenLM language model with HuggingFace implementation of Wav2Vec2CTC Model using beam search decoding
☆75 Updated 3 years ago