[ICLR 2026] SmartDJ: declarative audio editing with audio language model.
☆65Apr 25, 2026Updated last month
Alternatives and similar repositories for SmartDJ
Users who are interested in SmartDJ are comparing it to the libraries listed below.
- Multi-band Frequency Reconstruction for Neural Psychoacoustic Coding☆19May 5, 2025Updated last year
- EchoX: Towards Mitigating Acoustic-Semantic Gap via Echo Training for Speech-to-Speech LLMs☆47Sep 19, 2025Updated 8 months ago
- Official repository of Myna: Masking-Based Contrastive Learning of Musical Representations☆17Mar 31, 2025Updated last year
- Official implementation of "ViSAGe: Video-to-Spatial Audio Generation" (ICLR 2025)☆45Sep 10, 2025Updated 8 months ago
- This repository collects papers related to Speech Tokenizer.☆18Oct 16, 2024Updated last year
- FastSAG: Towards Fast Non-Autoregressive Singing Accompaniment Generation☆29Dec 19, 2024Updated last year
- ☆53Mar 24, 2026Updated 2 months ago
- ☆13Oct 11, 2024Updated last year
- A neural full-band audio codec for general audio sampled at 48 kHz with 7.5 kbps or 4.5 kbps.☆206Apr 30, 2026Updated 3 weeks ago
- This is the official repository of Daily-Omni: Towards Audio-Visual Reasoning with Temporal Alignment across Modalities☆41Apr 28, 2026Updated 3 weeks ago
- code for A Large-scale Dataset for Audio-Language Representation Learning☆14Sep 18, 2024Updated last year
- Project for speech bubble☆63Aug 15, 2025Updated 9 months ago
- The official repo for Both Ears Wide Open: Towards Language-Driven Spatial Audio Generation☆64Jul 2, 2025Updated 10 months ago
- ☆11Dec 28, 2023Updated 2 years ago
- Repository for IMUTube system☆12Jul 3, 2024Updated last year
- ☆10Feb 18, 2022Updated 4 years ago
- OpenAI compatible API servers for the Qwen3 TTS models☆82May 19, 2026Updated last week
- BMI270 I2C Python Library (bare bones)☆11Apr 28, 2023Updated 3 years ago
- Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark☆61Aug 29, 2024Updated last year
- Code for Neural Volume Reconstruction for Coherent Synthetic Aperture Sonar in SIGGRAPH 2023☆23Oct 28, 2023Updated 2 years ago
- [EMNLP 2024] ESC: Efficient Speech Coding with Cross-Scale Residual Vector Quantized Transformers☆125Mar 20, 2025Updated last year
- [ICLR 2025] Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes☆72Oct 8, 2025Updated 7 months ago
- 🦇 Encoder of BAT (Learning to Reason about Spatial Sounds with Large Language Models)☆84Feb 13, 2025Updated last year
- ☆13Jul 28, 2023Updated 2 years ago
- ☆12Nov 16, 2020Updated 5 years ago
- Utilities for SignWriting☆12Updated this week
- Official implementation of "sound distance estimation" WASPAA 23☆20Dec 31, 2023Updated 2 years ago
- ☆15Feb 10, 2025Updated last year
- Data analysis package for cubes. https://cubeviz.readthedocs.io/en/latest/☆15Jul 20, 2021Updated 4 years ago
- [ACL 2025 Main] UniCodec: a unified audio codec with a single codebook to support multi-domain audio data, including speech, music, and s…☆154May 30, 2025Updated 11 months ago
- misuka: A differentiable room acoustic renderer☆40May 12, 2026Updated 2 weeks ago
- This repository contains the code for our paper: Pyramidal edge-maps and Attention based Guided thermal image Super-Resolution (PAGSR)☆17Jan 21, 2022Updated 4 years ago
- Codebase for the paper "Sep-Stereo: Visually Guided Stereophonic Audio Generation by Associating Source Separation" (ECCV2020)☆72Oct 20, 2020Updated 5 years ago
- Source code for the EMNLP 2025 paper “DM-Codec: Distilling Multimodal Representations for Speech Tokenization”☆57Jun 1, 2025Updated 11 months ago
- Non-Intrusive Appliance Load Monitoring (NILM) based on Convolutional Neural Networks for PyTorch☆11Sep 5, 2020Updated 5 years ago
- LAFMA: A Latent Flow Matching Model for Text-to-Audio Generation (INTERSPEECH 2024)☆44Jun 13, 2024Updated last year
- AAAI 2025☆16Dec 13, 2024Updated last year
- Source code for AAAI 22 paper: Hybrid Neural Networks for On-Device Directional Hearing☆19Apr 10, 2024Updated 2 years ago
- ☆13Jan 3, 2024Updated 2 years ago