PeiwenSun2000/Both-Ears-Wide-Open

The official repo for Both Ears Wide Open: Towards Language-Driven Spatial Audio Generation

☆65

Alternatives and similar repositories for Both-Ears-Wide-Open

Users who are interested in Both-Ears-Wide-Open are comparing it to the repositories listed below.

zszheng147 / Spatial-AST
🦇 Encoder of BAT (Learning to Reason about Spatial Sounds with Large Language Models)
☆87 · Feb 13, 2025 · Updated last year
Ego4DSounds / Ego4DSounds
Ego4DSounds: A diverse egocentric dataset with high action-audio correspondence
☆21 · Jun 14, 2024 · Updated 2 years ago
MRSAudio / MRSAudio_Main
MRSAudio: A Large-Scale Multimodal Recorded Spatial Audio Dataset with Refined Annotations
☆43 · Oct 15, 2025 · Updated 9 months ago
Orlllem / seld_wav2vec2
☆18 · Feb 1, 2026 · Updated 5 months ago
jin-woo-lee / nfs-binaural
☆13 · Aug 13, 2023 · Updated 2 years ago
vTAD2025-Challenge / vTAD
☆17 · Oct 24, 2025 · Updated 9 months ago
jaeyeonkim99 / visage
Official implementation of "ViSAGe: Video-to-Spatial Audio Generation" (ICLR 2025)
☆47 · Sep 10, 2025 · Updated 10 months ago
liuhuadai / OmniAudio
[ICML 2025] PyTorch Implementation of "OmniAudio: Generating Spatial Audio from 360-Degree Video"
☆375 · Jun 27, 2025 · Updated last year
SAKi-77 / DiffStereo
DiffStereo: End-to-End Mono-to-Stereo Audio Generation with Diffusion Transformer
☆16 · Apr 17, 2026 · Updated 3 months ago
dieKarotte / ASAudio
☆59 · Oct 19, 2025 · Updated 9 months ago
Stability-AI / stable-audio-metrics
Metrics for evaluating music and audio generative models, with a focus on long-form, full-band, and stereo generations.
☆300 · Updated this week
ilpoviertola / V-AURA
The official implementation of V-AURA: Temporally Aligned Audio for Video with Autoregression (ICASSP 2025) (Oral)
☆35 · Feb 11, 2026 · Updated 5 months ago
cmots / UniSS
Official inference code for UniSS: Unified Expressive Speech-to-Speech Translation with Your Voice.
☆31 · May 30, 2026 · Updated last month
IoSR-Surrey / IoSR_ListeningRoom_BRIRs
The IoSR listening room multichannel BRIR dataset contains binaural room impulse responses measured at head angles of 0 to 360 degrees in…
☆22 · Mar 24, 2017 · Updated 9 years ago
MaikeZuefle / f-actor
☆28 · Jul 17, 2026 · Updated last week
SonyResearch / dcase2025_stereo_seld_data_generator
Data generator for stereo sound event localization and detection task of DCASE 2025 challenge
☆17 · Jul 17, 2025 · Updated last year
PeiwenSun2000 / SpaceVista
The official repo for SpaceVista: All-Scale Visual Spatial Reasoning from mm to km.
☆43 · May 26, 2026 · Updated 2 months ago
wilkinghoff / DSpAST
Code for the paper "DSpAST: Disentangled Representations for Spatial Audio Reasoning with Large Language Models"
☆17 · Oct 23, 2025 · Updated 9 months ago
bytedance / Make-An-Audio-2
a text-conditional diffusion probabilistic model capable of generating high fidelity audio.
☆197 · May 29, 2024 · Updated 2 years ago
phenyque / pyvbap
VBAP (Vector base amplitude panning) implementation in python with example application.
☆18 · Nov 2, 2024 · Updated last year
anton-jeran / AV-RIR
Audio-Visual Room Impulse Response Estimation
☆25 · Jul 22, 2024 · Updated 2 years ago
xiquan-li / Awesome-Audio-Generation
Curated list for papers, codes and resources related to Text-to-Audio (TTA) Generation
☆75 · Jul 20, 2026 · Updated last week
IFICL / stereocrw
Code for the Paper: [ECCV2022] Sound Localization by Self-Supervised Time-Delay Estimation
☆28 · Mar 15, 2023 · Updated 3 years ago
jianzongwu / Does-Hearing-Help-Seeing
☆19 · Dec 3, 2025 · Updated 7 months ago
ichi131 / Direction-based-BiTSE
☆15 · Sep 19, 2024 · Updated last year
juliawilkins / ambisonics2binaural_simple
A simple Python script to convert FOA audio to binaural.
☆17 · Nov 29, 2022 · Updated 3 years ago
chenjianyi / fastsag
FastSAG: Towards Fast Non-Autoregressive Singing Accompaniment Generation
☆29 · Dec 19, 2024 · Updated last year
MorenoLaQuatra / audioset-download
This package aims at simplifying the download of the AudioSet dataset.
☆60 · Jul 17, 2025 · Updated last year
facebookresearch / audiobox-aesthetics
Unified automatic quality assessment for speech, music, and sound.
☆747 · Jun 5, 2025 · Updated last year
jnwnlee / video-foley
Official implementation of "Video-Foley: Two-Stage Video-To-Sound Generation via Temporal Event Condition For Foley Sound". IEEE TASLP 20…
☆19 · Feb 27, 2026 · Updated 5 months ago
SonyResearch / LLM2Fx
Large Language Models for Music Post Production
☆46 · Mar 31, 2026 · Updated 3 months ago
PeiwenSun2000 / X-Stream
Official Repo of "$X$-Stream: Exploring MLLMs as Multiplexers for Multi-Stream Understanding"
☆33 · Jun 18, 2026 · Updated last month
thomasdeppisch / eMagLS
The End-to-End Magnitude Least Squares Binaural Renderer for Spherical Microphone Array Signals
☆41 · Feb 17, 2026 · Updated 5 months ago
bingo-todd / WaveLoc
End-to-End binaural sound localization
☆17 · Feb 27, 2020 · Updated 6 years ago
b-sigpro / sed-hsmm
Onset-and-Offset-Aware Sound Event Detection
☆21 · Feb 10, 2025 · Updated last year
facebookresearch / BinauralSpeechSynthesis
N/A
☆190 · May 19, 2022 · Updated 4 years ago
anton-jeran / MULTI-AUDIODEC
This is the official implementation of our multi-channel multi-speaker multi-spatial neural audio codec architecture.
☆55 · Mar 17, 2025 · Updated last year
nttrd-mdlab / wearable-seld-dataset
☆10 · Feb 18, 2022 · Updated 4 years ago
IoSR-Surrey / RealRoomBRIRs
Binaural impulse responses captured in real rooms.
☆41 · Mar 9, 2016 · Updated 10 years ago