[ICLR 2026] SmartDJ: declarative audio editing with audio language model.
☆56Mar 2, 2026Updated 3 weeks ago
Alternatives and similar repositories for SmartDJ
Users that are interested in SmartDJ are comparing it to the libraries listed below.
- Multi-band Frequency Reconstruction for Neural Psychoacoustic Coding☆19May 5, 2025Updated 10 months ago
- EchoX: Towards Mitigating Acoustic-Semantic Gap via Echo Training for Speech-to-Speech LLMs☆47Sep 19, 2025Updated 6 months ago
- Official repository of Myna: Masking-Based Contrastive Learning of Musical Representations☆17Mar 31, 2025Updated 11 months ago
- This repository collects papers related to Speech Tokenizer.☆17Oct 16, 2024Updated last year
- Official implementation of "ViSAGe: Video-to-Spatial Audio Generation" (ICLR 2025)☆41Sep 10, 2025Updated 6 months ago
- ☆51Updated this week
- FastSAG: Towards Fast Non-Autoregressive Singing Accompaniment Generation☆29Dec 19, 2024Updated last year
- ☆13Oct 11, 2024Updated last year
- A neural full-band audio codec for general audio sampled at 48 kHz with 7.5 kbps or 4.5 kbps.☆198Jul 14, 2025Updated 8 months ago
- This is the official repository of Daily-Omni: Towards Audio-Visual Reasoning with Temporal Alignment across Modalities☆39Updated this week
- Project for speech bubble☆58Aug 15, 2025Updated 7 months ago
- code for A Large-scale Dataset for Audio-Language Representation Learning☆14Sep 18, 2024Updated last year
- OpenAI compatible API servers for the Qwen3 TTS models☆78Mar 6, 2026Updated 3 weeks ago
- The official repo for Both Ears Wide Open: Towards Language-Driven Spatial Audio Generation☆60Jul 2, 2025Updated 8 months ago
- ☆11Dec 28, 2023Updated 2 years ago
- Repository for IMUTube system☆12Jul 3, 2024Updated last year
- BMI270 I2C Python Library (bare bones)☆11Apr 28, 2023Updated 2 years ago
- [ICLR 2025] Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes☆58Oct 8, 2025Updated 5 months ago
- Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark☆61Aug 29, 2024Updated last year
- Code for Neural Volume Reconstruction for Coherent Synthetic Aperture Sonar in SIGGRAPH 2023☆22Oct 28, 2023Updated 2 years ago
- [EMNLP 2024] ESC: Efficient Speech Coding with Cross-Scale Residual Vector Quantized Transformers☆125Mar 20, 2025Updated last year
- 🦇 Encoder of BAT (Learning to Reason about Spatial Sounds with Large Language Models)☆74Feb 13, 2025Updated last year
- ☆13Jul 28, 2023Updated 2 years ago
- [CVPR 2025] Pytorch implementation of the paper "Hearing Anywhere in Any Environment"☆29Sep 18, 2025Updated 6 months ago
- ☆12Nov 16, 2020Updated 5 years ago
- Official implementation of "sound distance estimation" WASPAA 23☆18Dec 31, 2023Updated 2 years ago
- Utilities for SignWriting☆12Mar 1, 2026Updated 3 weeks ago
- ☆15Feb 10, 2025Updated last year
- Data analysis package for cubes. https://cubeviz.readthedocs.io/en/latest/☆15Jul 20, 2021Updated 4 years ago
- [ACL 2025 Main] UniCodec: a unified audio codec with a single codebook to support multi-domain audio data, including speech, music, and s…☆153May 30, 2025Updated 9 months ago
- misuka: A differentiable room acoustic renderer☆38Feb 26, 2026Updated last month
- Code for the paper, SAMoSA - Sensing Activities with Motion and Sub-sampled Audio☆18Jan 24, 2023Updated 3 years ago
- This repository contains the code for our paper: Pyramidal edge-maps and Attention based Guided thermal image Super-Resolution (PAGSR)☆17Jan 21, 2022Updated 4 years ago
- Codebase for the paper "Sep-Stereo: Visually Guided Stereophonic Audio Generation by Associating Source Separation" (ECCV2020)☆72Oct 20, 2020Updated 5 years ago
- Source code for the EMNLP 2025 paper “DM-Codec: Distilling Multimodal Representations for Speech Tokenization”☆56Jun 1, 2025Updated 9 months ago
- Non-Intrusive Appliance Load Monitoring (NILM) based on Convolutional Neural Networks for PyTorch☆11Sep 5, 2020Updated 5 years ago
- LAFMA: A Latent Flow Matching Model for Text-to-Audio Generation (INTERSPEECH 2024)☆43Jun 13, 2024Updated last year
- AAAI 2025☆16Dec 13, 2024Updated last year
- Source code for AAAI 22 paper: Hybrid Neural Networks for On-Device Directional Hearing☆19Apr 10, 2024Updated last year