archinetai/audio-encoders-pytorch

A collection of audio autoencoders, in PyTorch.

☆44

Alternatives and similar repositories for audio-encoders-pytorch

Users who are interested in audio-encoders-pytorch are comparing it to the libraries listed below.

archinetai / archisound
A collection of pre-trained audio models, in PyTorch.
☆116 · Jan 27, 2023 · Updated 3 years ago
archinetai / audio-diffusion-pytorch-trainer
Trainer for audio-diffusion-pytorch
☆129 · Jan 13, 2023 · Updated 3 years ago
fork123aniket / Visual-Contrastive-Learning-for-Few-shot-Image-Classification
Implementation of Few-shot Binary Image Classification using Contrastive Learning-based Approach in PyTorch
☆11 · May 1, 2023 · Updated 3 years ago
archinetai / a-unet
A toolbox that provides hackable building blocks for generic 1D/2D/3D UNets, in PyTorch.
☆88 · Jun 12, 2023 · Updated 3 years ago
archinetai / audio-data-pytorch
A collection of useful audio datasets and transforms for PyTorch.
☆144 · Feb 11, 2023 · Updated 3 years ago
MuyangDu / T5Voice
T5Voice is a lightweight PyTorch implementation of T5-based text-to-speech synthesis, supporting both streaming and non-streaming speech …
☆28 · Nov 7, 2025 · Updated 8 months ago
xirongc / watermark-audio-diffusion
☆12 · Nov 21, 2023 · Updated 2 years ago
MTG / PodcastMix-inference
☆32 · Jan 6, 2022 · Updated 4 years ago
usc-sail / mica-subtitle-aligned-movie-sounds
A dataset for Audio-Visual Sound Event Detection in Movies
☆26 · Jan 23, 2023 · Updated 3 years ago
yoongi43 / VRVQ
Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"
☆11 · Apr 10, 2025 · Updated last year
JustinYuu / MM_Pyramid
[ACM MM 2022] MM_Pyramid: Multimodal Pyramid Attentional Network for Audio-Visual Event Localization and Video Parsing
☆15 · Aug 26, 2022 · Updated 3 years ago
Kinyugo / msanii
A novel diffusion-based model for synthesizing long-context, high-fidelity music efficiently.
☆196 · Apr 27, 2023 · Updated 3 years ago
EGO4D / audio-visual
☆69 · Sep 13, 2022 · Updated 3 years ago
teticio / deej-ai.online-app
ReactJS website to automatically generate playlists based on how the music sounds.
☆33 · May 10, 2026 · Updated 2 months ago
shengcanxu / canoSpeech
text to speech
☆10 · Mar 19, 2024 · Updated 2 years ago
OpenGVLab / LORIS
[ICML2023] Long-Term Rhythmic Video Soundtracker
☆63 · Jul 28, 2025 · Updated last year
Yuanshi9815 / LiteFocus
[Interspeech 2024] LiteFocus is a tool designed to accelerate diffusion-based TTA model, now implemented with the base model AudioLDM2.
☆34 · Mar 11, 2025 · Updated last year
TencentARC / common_trainer
Common template for pytorch project. Easy to extend and modify for new project.
☆13 · Dec 13, 2022 · Updated 3 years ago
voidful / vall-e-encodec
☆41 · May 15, 2023 · Updated 3 years ago
primepake / dac_vae
Descript Audio Codec - VAE Variant (.dac-vae): High-Fidelity Audio Compression with Variational Autoencoder
☆38 · Aug 30, 2025 · Updated 10 months ago
TehreemFarooqi / Preparing-a-speech-recognition-dataset-using-YouTube-videos
Using YouTube to prepare a speech recognition dataset for any language
☆10 · Mar 30, 2021 · Updated 5 years ago
horvathandris / dime
An ISO-4217 currency library for Gleam
☆13 · Jul 20, 2026 · Updated last week
flavioschneider / master-thesis
☆28 · Jan 17, 2023 · Updated 3 years ago
jasongief / CPSP
[2022 TPAMI] Contrastive Positive Sample Propagation along the Audio-Visual Event Line
☆32 · Mar 6, 2023 · Updated 3 years ago
RayYuki / CodecBench
☆24 · Nov 16, 2025 · Updated 8 months ago
WangHelin1997 / Aty-TTS
Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech
☆11 · May 14, 2025 · Updated last year
cychomatica / AudioPure
Defending against Adversarial Audio via Diffusion Model (ICLR 2023)
☆35 · Mar 2, 2023 · Updated 3 years ago
zqevans / audio-diffusion
☆87 · May 31, 2023 · Updated 3 years ago
drscotthawley / fad_pytorch
Frechet Audio Distance evaluation in PyTorch
☆36 · Jun 9, 2023 · Updated 3 years ago
L-YeZhu / D2M-GAN
[ECCV2022] D2M-GAN for music generation from dance videos
☆85 · Aug 16, 2022 · Updated 3 years ago
archinetai / aligner-pytorch
Sequence alignment methods with helpers for PyTorch.
☆24 · Nov 30, 2022 · Updated 3 years ago
tstafylakis / Speaker-Embeddings-Correlation-Pooling
Original implementation of the pooling method introduced in "Speaker embeddings by modeling channel-wise correlations"
☆11 · Sep 20, 2021 · Updated 4 years ago
WikiChao / Ego-AV-Loc
[CVPR 2023] Egocentric Audio-Visual Object Localization
☆27 · Jan 6, 2024 · Updated 2 years ago
jeremyjordan / midi-lm
Generative modeling of MIDI files
☆18 · Mar 7, 2024 · Updated 2 years ago
rishikksh20 / MiniMax-TTS-pytorch
Try to replicate the architecture of MiniMaxTTS mentioned in its technical report
☆47 · Sep 2, 2025 · Updated 10 months ago
HackAudio / juce-pedal-demo
GUI demonstration material used in a Belmont Homecoming Alumni Takeover course
☆11 · Feb 11, 2020 · Updated 6 years ago
archinetai / audio-diffusion-pytorch
Audio generation using diffusion models, in PyTorch.
☆2,095 · Jun 12, 2023 · Updated 3 years ago
nikolasibalic / Interactive-Publishing
Templates and tools for creating interactive figures and interactive text for publishing in EPUB3/HTML5.
☆20 · Jun 4, 2024 · Updated 2 years ago
MaxxP0 / WorldModel
WorldModel is a MaskGIT model trained on 8x8x8 Minecraft voxel volumes. Beyond generating blocks from scratch, it excels in filling space…
☆14 · Sep 12, 2023 · Updated 2 years ago