Layer-wise analysis of self-supervised pre-trained speech representations
☆133Oct 18, 2024Updated last year
Alternatives and similar repositories for layerwise-analysis
Users who are interested in layerwise-analysis are comparing it to the libraries listed below.
- Word Discovery in Visually Grounded, Self-Supervised Speech Models☆27Dec 4, 2023Updated 2 years ago
- ☆31Jul 13, 2023Updated 2 years ago
- Official implementation of MelHuBERT☆70Feb 21, 2026Updated 3 months ago
- ☆13Sep 25, 2024Updated last year
- (Interspeech 2023 & ICASSP 2024) Official repository for ARMHuBERT and STaRHuBERT☆41Aug 29, 2024Updated last year
- **Interspeech 2022** 《SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks》Speec…☆102Apr 10, 2025Updated last year
- Official implementation of the paper "Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus" acc…☆77Jul 16, 2023Updated 2 years ago
- Syllable Segmentation and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Model☆35Aug 27, 2023Updated 2 years ago
- Self-Supervised Speech Pre-training and Representation Learning Toolkit☆2,554Mar 12, 2026Updated 2 months ago
- [INTERSPEECH 2025 Oral] Official code for "Accelerating Diffusion-based Text-to-Speech Model Training with Dual Modality Alignment"☆66Jun 16, 2025Updated 11 months ago
- Collection of scripts from mHuBERT-147.☆35Nov 19, 2024Updated last year
- ASR text preprocessing utility☆21Aug 5, 2024Updated last year
- ☆19Apr 28, 2023Updated 3 years ago
- Transformer-based visually grounded speech models☆19Sep 22, 2022Updated 3 years ago
- Public Code for the paper MAE-AST: Masked Autoencoding Audio Spectrogram Transformer☆93Jun 9, 2022Updated 3 years ago
- [NeurIPS 2022] "Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Spee…☆17Sep 19, 2023Updated 2 years ago
- The official repository of SpeechCraft dataset, a large-scale expressive bilingual speech dataset with natural language descriptions.☆190Feb 28, 2026Updated 2 months ago
- Implementation of SoundStorm built upon SpeechTokenizer.☆116Nov 2, 2023Updated 2 years ago
- Sylber: Syllabic Embedding Representation of Speech from Raw Audio☆79Mar 17, 2025Updated last year
- Speaker-aware CTC (SACTC) for multi-talker overlapped speech recognition.☆22May 26, 2025Updated 11 months ago
- LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT☆74Sep 26, 2022Updated 3 years ago
- PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-To-Speech Using Natural Language Descriptions☆86Oct 11, 2024Updated last year
- ☆46Feb 16, 2023Updated 3 years ago
- S3PRL-VC: A Voice Conversion Toolkit based on S3PRL☆101Mar 15, 2026Updated 2 months ago
- ☆15Nov 11, 2024Updated last year
- This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples a…☆658Jun 9, 2024Updated last year
- (SLT 2024) Learning Video Temporal Dynamics with Cross-Modal Attention for Robust Audio-Visual Speech Recognition☆13Oct 22, 2024Updated last year
- Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context☆216Sep 10, 2024Updated last year
- speaker-disentangled speech linguistic content quantizer☆25Mar 19, 2025Updated last year
- A toolkit to calculate speech audio quality. Not affiliated with the original authors☆72Aug 13, 2024Updated last year
- A PyTorch implementation of the universal neural vocoder☆68Nov 6, 2020Updated 5 years ago
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer.☆117Jan 28, 2026Updated 3 months ago
- Unsupervised phone and word segmentation using dynamic programming on self-supervised VQ features.☆39May 5, 2026Updated 2 weeks ago
- AudioCodec-Hub is a Python library for encoding and decoding audio data, supporting various neural audio codec models☆25Sep 26, 2023Updated 2 years ago
- Source code and speech samples for the DSU-AVO paper accepted to INTERSPEECH 2023☆12May 13, 2024Updated 2 years ago
- [AAAI 2024] Code for CTX-vec2wav in UniCATS☆130Jun 11, 2024Updated last year
- ☆41May 15, 2023Updated 3 years ago
- This is the code for paper: XY-Tokenizer: Mitigating the Semantic-Acoustic Conflict in Low-Bitrate Speech Codecs☆92Sep 19, 2025Updated 8 months ago
- The official repository of Dynamic-SUPERB.☆200Jun 24, 2025Updated 10 months ago