[ICASSP 2023] Tempo vs. Pitch: understanding self-supervised tempo estimation
☆13 · Aug 2, 2023 · Updated 2 years ago
Alternatives and similar repositories for steme
Users that are interested in steme are comparing it to the libraries listed below
- Official Repository of Six Dragons Fly Again (ISMIR 2024) ☆13 · Nov 13, 2025 · Updated 3 months ago
- MusAV: a dataset of relative arousal-valence annotations for validation of audio models ☆17 · Dec 16, 2022 · Updated 3 years ago
- wake-up word emotion recognition [APSIPA 2022] ☆17 · Nov 11, 2022 · Updated 3 years ago
- This is a fork of the original fairseq repository (version 0.12.2) with added classes for training mHuBERT-147. ☆20 · Nov 19, 2024 · Updated last year
- A MIDI-based Real-time Pianoroll Note Visualization Library in JavaScript ☆20 · Aug 1, 2023 · Updated 2 years ago
- Do Music Generation Models Encode Music Theory? (ISMIR 2024) ☆22 · Oct 30, 2024 · Updated last year
- This is the code of the ICASSP 2020 paper "Joint phoneme alignment and text-informed speech separation on highly corrupted speech" ☆15 · Apr 8, 2024 · Updated last year
- Simple baseline model for the HEAR benchmark ☆23 · Feb 17, 2026 · Updated 2 weeks ago
- Korean text data preprocess toolkit for NLP ☆18 · Jun 11, 2019 · Updated 6 years ago
- Code for the paper "Toward Fully Self-Supervised Multi-Pitch Estimation". ☆23 · Sep 27, 2025 · Updated 5 months ago
- Unofficial Implementation of MLP-Mixer in TensorFlow ☆27 · May 6, 2021 · Updated 4 years ago
- Pre-training, fine-tuning, and inference code with the MAEST models for music analysis applications. ☆54 · Jun 27, 2025 · Updated 8 months ago
- CLaMP 2: Multimodal Music Information Retrieval Across 101 Languages Using Large Language Models [NAACL 2025] ☆60 · Feb 28, 2025 · Updated last year
- Official implementation of "Equivariant Self-Supervision for Musical Tempo Estimation (ISMIR 2022)" ☆26 · Feb 6, 2023 · Updated 3 years ago
- ☆32 · Jan 6, 2022 · Updated 4 years ago
- A piano music dataset with Audio, Symbolic and Text labels ☆34 · Mar 6, 2025 · Updated last year
- PyTorch Dataset for Speech and Music audio ☆80 · Jul 12, 2024 · Updated last year
- Cantonese Text to Speech with VITS implementation ☆37 · Apr 8, 2023 · Updated 2 years ago
- ☆38 · Jun 16, 2024 · Updated last year
- A standardized toolkit of Kernel Audio Distance (KAD), a distribution-free, unbiased, and computationally efficient metric for evaluating … ☆95 · Jun 12, 2025 · Updated 8 months ago
- OCTRA is a web-application for the orthographic transcription of audio files. ☆39 · Feb 17, 2026 · Updated 2 weeks ago
- Code for the Interspeech 2024 paper "MM-KWS: Multi-modal Prompts for Multilingual User-defined Keyword Spotting" ☆45 · Jan 24, 2026 · Updated last month
- MuChoMusic is a benchmark for evaluating music understanding in multimodal audio-language models. ☆44 · Dec 3, 2024 · Updated last year
- [ICLR 2022] "Audio Lottery: Speech Recognition Made Ultra-Lightweight, Noise-Robust, and Transferable", by Shaojin Ding, Tianlong Chen, Z… ☆32 · Apr 8, 2022 · Updated 3 years ago
- This code is to run the WARP-Q speech quality metric. ☆34 · Oct 15, 2024 · Updated last year
- Compute distribution-based quality metrics for audio data using embeddings, with a focus on music. ☆43 · Jan 15, 2026 · Updated last month
- Official Repository of Paper: "Towards High-Quality Zero-Shot Singing Voice Conversion in Low-Resource Scenarios" (AAAI 2026) ☆89 · Jan 31, 2026 · Updated last month
- Official Implementation of Jointist ☆37 · Jul 26, 2023 · Updated 2 years ago
- PyTorch implementation of DiffRoll, a diffusion-based generative automatic music transcription (AMT) model ☆80 · Dec 6, 2023 · Updated 2 years ago
- A Python library for Real-time Music Alignment ☆59 · Updated this week
- Official Implementation of GLAP - General Language Audio Pretraining ☆64 · Jan 5, 2026 · Updated 2 months ago
- ☆41 · Oct 16, 2025 · Updated 4 months ago
- Grapheme-to-phoneme tool for corpus conversion, where phonemes match Phoible inventories ☆19 · Apr 10, 2025 · Updated 10 months ago
- Code for the paper "RIR-in-a-Box : Estimating Room Acoustics from 3D Mesh Data through Shoebox Approximation" presented at Interspeech 20… ☆16 · Sep 1, 2024 · Updated last year
- Repository to storage the 4mula dataset ☆10 · Sep 1, 2021 · Updated 4 years ago
- Official repository of the work "Low-complexity Unsupervised Audio Anomaly Detection exploiting Separable Convolutions and Angular Loss" … ☆10 · Nov 6, 2024 · Updated last year
- SChunk-Encoder (Transformer or Conformer) for streaming E2E ASR ☆11 · Oct 21, 2022 · Updated 3 years ago
- A python script COMMAND LINE utility to AUTO GENERATE SUBTITLE FILE (using free Vosk Speech Recognition API) and TRANSLATED SUBTITLE FILE… ☆11 · May 5, 2024 · Updated last year
- ☆10 · Sep 17, 2022 · Updated 3 years ago