iamyuanchung/VQ-APC

Vector Quantized Autoregressive Predictive Coding (VQ-APC)

☆38

Alternatives and similar repositories for VQ-APC

Users who are interested in VQ-APC are comparing it to the repositories listed below.

Alexander-H-Liu / NPC
Non-Autoregressive Predictive Coding
☆51 · Nov 3, 2020 · Updated 5 years ago
wnhsu / ResDAVEnet-VQ
Official code for the paper "Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech"
☆28 · Feb 22, 2022 · Updated 4 years ago
iamyuanchung / Autoregressive-Predictive-Coding
Autoregressive Predictive Coding: An unsupervised autoregressive model for speech representation learning
☆191 · Jan 29, 2020 · Updated 6 years ago
idiap / apam
APAM toolkit is built on PyTorch and provides recipes to adapt pretrained acoustic models with a variety of sequence discriminative train…
☆14 · Feb 15, 2021 · Updated 5 years ago
ttaoREtw / semi-tts
Semi-supervised Learning for Multi-speaker Text-to-speech Synthesis Using Discrete Speech Representation
☆39 · Jul 16, 2020 · Updated 6 years ago
jasonppy / FaST-VGS-Family
Transformer-based visually grounded speech models
☆19 · Sep 22, 2022 · Updated 3 years ago
awslabs / speech-representations
Code for DeCoAR (ICASSP 2020) and BERTphone (Odyssey 2020)
☆104 · Nov 26, 2022 · Updated 3 years ago
audioku / cross-accent-maml-asr
Model-agnostic meta-learning (MAML) implementation for cross-accented ASR
☆45 · Feb 9, 2024 · Updated 2 years ago
wenet-e2e / wecut
video cut powered by AI
☆23 · Nov 15, 2022 · Updated 3 years ago
distsup / DistSup
Representation learning for NLP @ JSALT19
☆41 · Oct 31, 2020 · Updated 5 years ago
MiuLab / TaylorGAN
☆31 · Apr 24, 2021 · Updated 5 years ago
unilight / cdvae-vc
TensorFlow Implementation of CDVAE-VC.
☆54 · Mar 24, 2023 · Updated 3 years ago
facebookresearch / CPC_audio
An implementation of the Contrastive Predictive Coding (CPC) method to train audio features in an unsupervised fashion.
☆374 · Oct 12, 2021 · Updated 4 years ago
gcucurull / jax-gat
JAX implementation of Graph Attention Networks
☆13 · Jan 29, 2022 · Updated 4 years ago
Hertin / WavPrompt
☆37 · Jun 30, 2022 · Updated 4 years ago
shaojinding / GroupLatentEmbedding
PyTorch implementation of "Group Latent Embedding for Vector Quantized Variational Autoencoder in Non-Parallel Voice Conversion" [Intersp…
☆28 · Sep 17, 2019 · Updated 6 years ago
1ytic / warp-rna
Recurrent Neural Aligner
☆51 · Apr 14, 2020 · Updated 6 years ago
athena-team / athena-decoder
☆76 · Mar 18, 2022 · Updated 4 years ago
iamjanvijay / rnnt_decoder_cuda
An efficient implementation of RNN-T Prefix Beam Search in C++/CUDA.
☆67 · Jan 7, 2026 · Updated 6 months ago
cornerfarmer / ctc_segmentation
Segment a given audio into utterances using a trained end-to-end ASR model.
☆75 · Oct 9, 2020 · Updated 5 years ago
jaywalnut310 / Vector-Quantized-Autoencoders
TensorFlow Implementation of "Theory and Experiments on Vector Quantized Autoencoders"
☆15 · Feb 27, 2019 · Updated 7 years ago
martinmamql / relative_predictive_coding
Project page for the paper "Self-supervised Representation Learning with Relative Predictive Coding"
☆19 · Jul 8, 2021 · Updated 5 years ago
idiap / icassp-oov-recognition
Data and code related to the ICASSP submission "A comparison of methods for OOV-word recognition"
☆17 · Nov 28, 2021 · Updated 4 years ago
JeongHun0716 / e-mvsr
Efficient Training for Multilingual Visual Speech Recognition: Pre-training with Discretized Visual Speech Representation (ACM MM 2024)
☆20 · Mar 17, 2025 · Updated last year
coqui-ai / inference-engine
Coqui Inference Engine
☆41 · Aug 3, 2021 · Updated 4 years ago
xinjli / alqalign
multilingual speech aligner
☆78 · Nov 19, 2023 · Updated 2 years ago
BUTSpeechFIT / ASR-hybrid-decoding
☆17 · Nov 25, 2019 · Updated 6 years ago
dharwath / DAVEnet-pytorch
Deep Audio-Visual Embedding network (DAVEnet) implementation in PyTorch
☆66 · Aug 31, 2018 · Updated 7 years ago
seujung / WaveNet-gluon
Implementation of WaveNet with Gluon
☆16 · Dec 27, 2018 · Updated 7 years ago
mvansegbroeck-zz / featxtra
Kaldi Speech Processing Tools
☆25 · Nov 16, 2018 · Updated 7 years ago
HS-YN / PanoAVQA
Official repository of PanoAVQA: Grounded Audio-Visual Question Answering in 360° Videos (ICCV 2021)
☆16 · Oct 12, 2021 · Updated 4 years ago
kamperh / vqwordseg
Unsupervised phone and word segmentation using dynamic programming on self-supervised VQ features.
☆39 · May 5, 2026 · Updated 2 months ago
umbertocappellazzo / Llama-AVSR
Official PyTorch implementation of "Large Language Models are Strong Audio-Visual Speech Recognition Learners" [ICASSP 2025] and "Mitigat…
☆64 · Jan 18, 2026 · Updated 6 months ago
JeongHun0716 / zero-avsr
Official PyTorch implementation for "Zero-AVSR: Zero-Shot Audio-Visual Speech Recognition with LLMs by Learning Language-Agnostic Speech …
☆36 · May 11, 2025 · Updated last year
jasonppy / word-discovery
Word Discovery in Visually Grounded, Self-Supervised Speech Models
☆27 · Dec 4, 2023 · Updated 2 years ago
l3das / L3DAS22
☆57 · Jun 4, 2022 · Updated 4 years ago
andi611 / ZeroSpeech-TTS-without-T
A PyTorch implementation for the ZeroSpeech 2019 challenge.
☆112 · Nov 12, 2019 · Updated 6 years ago
Mddct / transformer-vocos
☆35 · Sep 6, 2025 · Updated 10 months ago
vivsivaraman / sourcesepganprior
☆18 · May 15, 2021 · Updated 5 years ago