ruanvdmerwe / triplet-entropy-loss
Project repository for the work done in Triplet Entropy Loss: Improving The Generalization of Short Speech Language Identification Systems
☆13Feb 17, 2021Updated 5 years ago
Alternatives and similar repositories for triplet-entropy-loss
Users that are interested in triplet-entropy-loss are comparing it to the libraries listed below
- ☆18Apr 12, 2017Updated 8 years ago
- Source Code for the Paper "UNIFIED KEYWORD SPOTTING AND AUDIO TAGGING ON MOBILE DEVICES WITH TRANSFORMERS"☆23Mar 6, 2023Updated 2 years ago
- Demo audio of VARA-TTS model☆20Jun 11, 2021Updated 4 years ago
- Implementation of the subscale framework from the WaveRNN paper, building on top of Fatchord's WaveRNN repo☆19Oct 8, 2020Updated 5 years ago
- My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one☆26Aug 5, 2024Updated last year
- Code for the paper: "Leveraging speaker attribute information using multi task learning for speaker verification and diarization" present…☆26Oct 5, 2022Updated 3 years ago
- GlottDNN vocoder and tools for training DNN excitation models☆32Feb 27, 2021Updated 4 years ago
- Creation of a multi user audio first annotation tool - GSoC 2021☆29Mar 30, 2023Updated 2 years ago
- Chinese prosody prediction model based on random forests and conditional random fields☆28Jul 25, 2024Updated last year
- C++ implementation of end-to-end TTS which combines both Tacotron2 and the LPCNET vocoder.☆32Oct 1, 2019Updated 6 years ago
- Text Normalization utilities for normalizing text for TTS☆20Updated this week
- GaugeMeterView is view which can be used in different Meter applications☆12Feb 25, 2022Updated 3 years ago
- An unofficial PyTorch implementation of Mix-Phoneme-Bert☆40Jul 10, 2023Updated 2 years ago
- This repository contains the code related to the paper 'DENet: a deep architecture for audio surveillance applications'.☆42Jul 23, 2023Updated 2 years ago
- ERISHA is a multilingual multispeaker expressive speech synthesis framework. It can transfer the expressivity to the speaker's voice for…☆43Dec 17, 2020Updated 5 years ago
- ☆41May 19, 2023Updated 2 years ago
- Machine Learning based model to predict Insurance Pure Premium☆12Jan 24, 2017Updated 9 years ago
- A python script COMMAND LINE utility to AUTO GENERATE SUBTITLE FILE (using free Vosk Speech Recognition API) and TRANSLATED SUBTITLE FILE…☆11May 5, 2024Updated last year
- Unsupervised phone and word segmentation using dynamic programming on self-supervised VQ features.☆39Mar 4, 2024Updated last year
- A tool to collect/validate audio recordings from workers on Amazon Mechanical Turk. Written in Python/Flask. (originally hosted on github…☆14Dec 19, 2022Updated 3 years ago
- SChunk-Encoder (Transformer or Conformer) for streaming E2E ASR☆11Oct 21, 2022Updated 3 years ago
- Learning an Interpretable End-to-End Network for Real-Time Acoustic Beamforming☆15Aug 20, 2024Updated last year
- ☆10Oct 23, 2024Updated last year
- Official repository of the work "Low-complexity Unsupervised Audio Anomaly Detection exploiting Separable Convolutions and Angular Loss" …☆10Nov 6, 2024Updated last year
- [ICLR 2025] Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes☆57Oct 8, 2025Updated 4 months ago
- Grapheme-to-phoneme tool for corpus conversion, where phonemes match Phoible inventories☆19Apr 10, 2025Updated 10 months ago
- Code for the paper "RIR-in-a-Box : Estimating Room Acoustics from 3D Mesh Data through Shoebox Approximation" presented at Interspeech 20…☆15Sep 1, 2024Updated last year
- eCMU: An Efficient Phase-aware Framework for Music Source Separation with Conformer (IEEE RIVF23)☆10Oct 30, 2024Updated last year
- Conversion of Electrocardiography paper records to binarization and converting to digital form in order to extract features to feed in th…☆10Dec 16, 2020Updated 5 years ago
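- Facial expression analysis and recognition based on the Dlib library; completed jointly by a four-person team, with a 7-day development cycle and two days to finalize the closing report; completed July 11, 2019; software copyright applied for☆12Feb 28, 2020Updated 5 years ago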
- Russian phonetical transcription☆11Nov 19, 2025Updated 2 months ago
- This repository defines a python class that can be used to load data for the tf.keras.model.fit_generator function by using a torch.utils…☆11Oct 26, 2024Updated last year
- Whisper finetuning☆15Apr 9, 2025Updated 10 months ago
- ☆11Aug 11, 2023Updated 2 years ago
- ☆13Oct 9, 2025Updated 4 months ago
- notes and key points on ML / DL papers☆12Jun 12, 2022Updated 3 years ago
- Dataset release for Emotional TTS in Indian Accent☆40Sep 2, 2022Updated 3 years ago
- NU-Wave: A Diffusion Probabilistic Model for Neural Audio Upsampling☆37May 25, 2021Updated 4 years ago
- Using YouTube to prepare a speech recognition dataset for any language☆10Mar 30, 2021Updated 4 years ago