SamsungLabs / SummaryMixing
This repository implements SummaryMixing, a simpler, faster, and much cheaper replacement for self-attention in automatic speech recognition (see: https://arxiv.org/abs/2307.07421). The code is ready to be used with the SpeechBrain toolkit.
☆111Updated last month
Related projects
Alternatives and complementary repositories for SummaryMixing
- ☆84Updated 7 months ago
- ☆56Updated last year
- Example code for a neural transducer model.☆59Updated 9 months ago
- This is the official repository of the papers "Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers" and "Efficient Fi…☆35Updated 3 months ago
- LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT☆69Updated 2 years ago
- Various speech datasets made available to the public☆98Updated last month
- Official code for Wav2Seq☆95Updated 2 years ago
- The VoxTube dataset official repository☆61Updated 8 months ago
- Transcribing Speech with Multinomial Diffusion, training code and models.☆75Updated last year
- Implementation of BEST-RQ - a model for self-supervised learning of speech signals using a random projection quantizer, in Pytorch.☆93Updated last year
- Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation☆132Updated 9 months ago
- Standalone implementation of the CUDA-accelerated WFST Decoder available in Riva☆80Updated 2 months ago
- ☆51Updated last week
- Code for the paper: GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities☆77Updated 3 months ago
- AudioBench: A Universal Benchmark for Audio Large Language Models☆89Updated last month
- An easy way to fine-tune Wav2Vec 2.0 for low-resource languages.☆81Updated last year
- This repository surveys papers focusing on prompting and adapters for speech processing.☆103Updated last year
- Zero-shot Domain-sensitive Speech Recognition with Prompt-conditioning Fine-tuning (ASRU2023)☆26Updated last year
- Python wrappers for Kaldi's Levenshtein distance and alignment code.☆61Updated 7 months ago
- Confidence interval computation for evaluation in machine learning using the bootstrapping approach☆66Updated 7 months ago
- CVSS: A Massively Multilingual Speech-to-Speech Translation Corpus☆182Updated 2 years ago
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023.☆70Updated last year
- Codes and datasets for our ICASSP2023 paper, Evaluating parameter-efficient transfer learning approaches on SURE benchmark for speech und…☆42Updated last year
- This is a list of speech tasks and datasets, which can provide training data for Generative AI, AIGC, AI model training, intelligent spee…☆72Updated 5 months ago
- Official code for Interspeech 2023 paper "Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clusterin…☆44Updated last year
- ConMamba for Automatic Speech Recognition☆44Updated 2 months ago
- [NeurIPS'22] Squeezeformer: An Efficient Transformer for Automatic Speech Recognition☆245Updated last year
- asr2k☆48Updated 5 months ago
- Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context☆179Updated 2 months ago