mechanicalsea/spectra

Spectra extraction tutorials based on torch and torchaudio.

☆41

Alternatives and similar repositories for spectra

Users who are interested in spectra are comparing it to the libraries listed below.

qinxiaoyi / Simple-Attention-Module-based-Speaker-Verification-with-Iterative-Noisy-Label-Detection
☆12 · Jun 14, 2022 · Updated 4 years ago
smallflyingpig / learning-to-fool-the-speaker-recognition
Code for the paper "learning to fool the speaker recognition"
☆10 · Jun 12, 2020 · Updated 6 years ago
egorsmkv / asr-corpus-creator
This app is intended to automatically create a corpus for ASR systems using pseudo-labeling.
☆27 · Feb 15, 2024 · Updated 2 years ago
mechanicalsea / sugar
Efficient Speech Processing Toolkit for Automatic Speaker Recognition
☆17 · Feb 8, 2023 · Updated 3 years ago
arasgungore / PCM-and-DM-modulators
A Python/MATLAB project which implements pulse-code modulation (PCM) and delta modulation (DM).
☆13 · Aug 8, 2022 · Updated 3 years ago
Anwarvic / CNN-for-Raw-Waveforms
This is my PyTorch implementation of the "Very Deep Convolutional Neural Networks For Raw Waveforms" research paper published in 2016.
☆17 · Aug 24, 2021 · Updated 4 years ago
nicklashansen / voice-activity-detection
Voice Activity Detection (VAD) using deep learning.
☆204 · Oct 14, 2019 · Updated 6 years ago
yogeshbalaji / Normalized-Wasserstein
Normalized Wasserstein for Mixture Distributions
☆11 · Mar 24, 2023 · Updated 3 years ago
echocatzh / torch-mfcc
A librosa STFT/Fbank/mfcc feature extraction written in PyTorch using 1D Convolutions.
☆79 · Aug 19, 2022 · Updated 3 years ago
vadimkantorov / convasr
Baseline convolutional ASR system in PyTorch
☆21 · Nov 16, 2023 · Updated 2 years ago
gaochangw / DeltaRNN
Latest PyTorch Implementation of DeltaGRU & DeltaLSTM that Exploits Temporal Sparsity in Sequential Data
☆18 · Sep 30, 2023 · Updated 2 years ago
alxmamaev / ultimate_tts
☆13 · Aug 7, 2021 · Updated 4 years ago
GeWanying / shap-anti-spoofing
This repository includes the code to reproduce our paper [Explainable deepfake and spoofing detection: an attack analysis using SHapley A…
☆12 · Jan 24, 2024 · Updated 2 years ago
zyzisyz / mfa_conformer
☆160 · Jan 9, 2023 · Updated 3 years ago
xmos / lib_dsp
Core digital signal processing function library
☆23 · May 20, 2025 · Updated last year
Dsplib / dspl
Digital Signal Processing Library
☆15 · May 4, 2017 · Updated 9 years ago
celebrity-audio-collection / videoprocess
CN-Celeb, a large-scale Chinese celebrities dataset published by Center for Speech and Language Technology (CSLT) at Tsinghua University.
☆80 · Nov 9, 2019 · Updated 6 years ago
marsbroshok / VAD-python
Voice Activity Detector in Python
☆481 · Nov 17, 2020 · Updated 5 years ago
york135 / CTC_CE_for_AST
The official repo/implementation of the paper "Training a Singing Transcription Model Using Connectionist Temporal Classification Loss an…
☆12 · Mar 25, 2025 · Updated last year
AlbertiPot / nar
Code for Neural Architecture Ranker and detailed cell information datasets based on the NAS-Bench series
☆12 · Jul 11, 2022 · Updated 4 years ago
SunnyCYC / aug4beat
☆17 · Dec 17, 2025 · Updated 7 months ago
ChasTechProjects / Debian64Pi-old
64-bit Debian Stretch for the Raspberry Pi 3
☆12 · Dec 27, 2018 · Updated 7 years ago
SShirleyy / Adaptive-audio-filter-design
Reduces tonal noise with the Adaptive Line Enhancer/Canceler technique, using LMS, RLS, NLMS and Kalman adaptive filters.
☆21 · May 17, 2018 · Updated 8 years ago
AmitProspeed / LMS-Adaptive-Filter
An Adaptive Line Enhancer (ALE) based on Least Mean Square (LMS) algorithm to eliminate broadband noise from a narrowband signal
☆24 · Mar 31, 2019 · Updated 7 years ago
LCF2764 / autoKWS2021_1st_solution
Auto-KWS 2021 Challenge 1st place solution.
☆11 · Jul 20, 2021 · Updated 5 years ago
zaocan666 / DyViSE
Dynamic vision-guided speaker embedding for audio-visual speaker diarization
☆12 · Jul 5, 2022 · Updated 4 years ago
VITA-Group / Audio-Lottery
[ICLR 2022] "Audio Lottery: Speech Recognition Made Ultra-Lightweight, Noise-Robust, and Transferable", by Shaojin Ding, Tianlong Chen, Z…
☆32 · Apr 8, 2022 · Updated 4 years ago
ankitshah009 / WALNet-Weak_Label_Analysis
Repository for Weak Label Learning for Audio Events - A closer look. Uses Audioset subset data provided for reproducibility.
☆32 · Sep 13, 2023 · Updated 2 years ago
nDmitry / ogimgd
Social previews generator as a microservice.
☆12 · Apr 9, 2022 · Updated 4 years ago
manojpamk / pytorch_xvectors
Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196
☆321 · Nov 11, 2020 · Updated 5 years ago
jongwook / crepe
☆12 · Jun 5, 2018 · Updated 8 years ago
ankurdhuriya / multispeaker-glow-tts
☆11 · Jan 28, 2022 · Updated 4 years ago
awslabs / speech-representations
Code for DeCoAR (ICASSP 2020) and BERTphone (Odyssey 2020)
☆104 · Nov 26, 2022 · Updated 3 years ago
singularityhub / singularity-compose-examples
A simple example of running a MongoDB instance to query a database
☆10 · Aug 31, 2022 · Updated 3 years ago
42io / tflite_kws
☆13 · May 1, 2026 · Updated 2 months ago
egorsmkv / optimized-whisper
Use quantized versions of Whisper to speed up inference
☆12 · Oct 16, 2024 · Updated last year
xk-wang / MusicYOLO
MusicYOLO framework uses the object detection model, YOLOx, to locate notes in the spectrogram.
☆11 · Jan 29, 2022 · Updated 4 years ago
ccoreilly / wav2vec2-catala
Wav2Vec 2.0 Catalan training scripts and models
☆12 · Jun 18, 2021 · Updated 5 years ago
filippogiruzzi / voice_activity_detection
Voice Activity Detection based on Deep Learning & TensorFlow
☆373 · Jul 22, 2026 · Updated last week