[ICLR 2026] SmartDJ: declarative audio editing with audio language model.
☆66 · Apr 25, 2026 · Updated last month
Alternatives and similar repositories for SmartDJ
Users that are interested in SmartDJ are comparing it to the libraries listed below.
- Multi-band Frequency Reconstruction for Neural Psychoacoustic Coding ☆19 · May 5, 2025 · Updated last year
- EchoX: Towards Mitigating Acoustic-Semantic Gap via Echo Training for Speech-to-Speech LLMs ☆47 · Sep 19, 2025 · Updated 8 months ago
- Official repository of Myna: Masking-Based Contrastive Learning of Musical Representations ☆17 · Mar 31, 2025 · Updated last year
- Talker-T2AV: Joint Talking Audio-Video Generation with Autoregressive Diffusion Modeling ☆75 · May 24, 2026 · Updated 3 weeks ago
- This repository collects papers related to Speech Tokenizer. ☆18 · Oct 16, 2024 · Updated last year
- FastSAG: Towards Fast Non-Autoregressive Singing Accompaniment Generation ☆29 · Dec 19, 2024 · Updated last year
- ☆53 · Mar 24, 2026 · Updated 2 months ago
- ☆13 · Oct 11, 2024 · Updated last year
- A neural full-band audio codec for general audio sampled at 48 kHz with 7.5 kbps or 4.5 kbps. ☆210 · Jun 5, 2026 · Updated last week
- This is the official repository of Daily-Omni: Towards Audio-Visual Reasoning with Temporal Alignment across Modalities ☆42 · Apr 28, 2026 · Updated last month
- Code for A Large-scale Dataset for Audio-Language Representation Learning ☆14 · Sep 18, 2024 · Updated last year
- Project for speech bubble ☆65 · Aug 15, 2025 · Updated 10 months ago
- The official repo for Both Ears Wide Open: Towards Language-Driven Spatial Audio Generation ☆65 · Jul 2, 2025 · Updated 11 months ago
- ☆11 · Dec 28, 2023 · Updated 2 years ago
- Repository for IMUTube system ☆12 · Jul 3, 2024 · Updated last year
- ☆10 · Feb 18, 2022 · Updated 4 years ago
- OpenAI compatible API servers for the Qwen3 TTS models ☆83 · May 19, 2026 · Updated 3 weeks ago
- BMI270 I2C Python Library (bare bones) ☆11 · Apr 28, 2023 · Updated 3 years ago
- Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark ☆63 · Aug 29, 2024 · Updated last year
- Code for Neural Volume Reconstruction for Coherent Synthetic Aperture Sonar in SIGGRAPH 2023 ☆23 · Oct 28, 2023 · Updated 2 years ago
- [EMNLP 2024] ESC: Efficient Speech Coding with Cross-Scale Residual Vector Quantized Transformers ☆125 · Mar 20, 2025 · Updated last year
- [ICLR 2025] Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes ☆77 · Oct 8, 2025 · Updated 8 months ago
- 🦇 Encoder of BAT (Learning to Reason about Spatial Sounds with Large Language Models) ☆87 · Feb 13, 2025 · Updated last year
- ☆13 · Jul 28, 2023 · Updated 2 years ago
- ☆12 · Nov 16, 2020 · Updated 5 years ago
- Utilities for SignWriting ☆12 · Updated this week
- Official implementation of "sound distance estimation" WASPAA 23 ☆20 · Dec 31, 2023 · Updated 2 years ago
- ☆15 · Feb 10, 2025 · Updated last year
- Data analysis package for cubes. https://cubeviz.readthedocs.io/en/latest/ ☆15 · Jul 20, 2021 · Updated 4 years ago
- [ACL 2025 Main] UniCodec: a unified audio codec with a single codebook to support multi-domain audio data, including speech, music, and s… ☆154 · May 30, 2025 · Updated last year
- misuka: A differentiable room acoustic renderer ☆41 · May 12, 2026 · Updated last month
- This repository contains the code for our paper: Pyramidal edge-maps and Attention based Guided thermal image Super-Resolution (PAGSR) ☆17 · Jan 21, 2022 · Updated 4 years ago
- Codebase for the paper "Sep-Stereo: Visually Guided Stereophonic Audio Generation by Associating Source Separation" (ECCV2020) ☆72 · Oct 20, 2020 · Updated 5 years ago
- Source code for the EMNLP 2025 paper “DM-Codec: Distilling Multimodal Representations for Speech Tokenization” ☆57 · Jun 1, 2025 · Updated last year
- Non-Intrusive Appliance Load Monitoring (NILM) based on Convolutional Neural Networks for PyTorch ☆11 · Sep 5, 2020 · Updated 5 years ago
- LAFMA: A Latent Flow Matching Model for Text-to-Audio Generation (INTERSPEECH 2024) ☆44 · Jun 13, 2024 · Updated 2 years ago
- AAAI 2025 ☆16 · Dec 13, 2024 · Updated last year
- Source code for AAAI 22 paper: Hybrid Neural Networks for On-Device Directional Hearing ☆19 · Apr 10, 2024 · Updated 2 years ago
- ☆13 · Jan 3, 2024 · Updated 2 years ago