The official repo for paper "Spatial Speech Translation: Translating Across Space With Binaural Hearables"
☆74 · Aug 15, 2025 · Updated 9 months ago
Alternatives and similar repositories for Spatial-Speech-Translation
Users that are interested in Spatial-Speech-Translation are comparing it to the libraries listed below.
- ☆31 · Jul 31, 2025 · Updated 10 months ago
- ☆14 · May 20, 2025 · Updated last year
- Microphone Array Real-time System ☆13 · Jun 7, 2017 · Updated 9 years ago
- ☆15 · Apr 6, 2026 · Updated 2 months ago
- ☆63 · Jul 1, 2025 · Updated 11 months ago
- ☆15 · Apr 11, 2024 · Updated 2 years ago
- ☆22 · Aug 21, 2025 · Updated 9 months ago
- This is the code and dataset repo for Interspeech 2024 paper "Target conversation extraction: Source separation using turn-taking dynamic… ☆58 · Aug 15, 2025 · Updated 9 months ago
- ☆21 · Jul 15, 2024 · Updated last year
- Official implementation of the paper "MusicInfuser: Making Video Diffusion Listen and Dance" (CVPR`26) ☆83 · May 3, 2026 · Updated last month
- Official GPU implementation of the paper "PPLLaVA: Varied Video Sequence Understanding With Prompt Guidance" ☆133 · Nov 19, 2024 · Updated last year
- ☆166 · Nov 29, 2024 · Updated last year
- ☆10 · Jul 25, 2023 · Updated 2 years ago
- Master repository for 3D Spatial Audio Reproduction Toolbox ☆22 · Jul 25, 2016 · Updated 9 years ago
- ☆478 · May 19, 2025 · Updated last year
- Async MCP server with Minimax API integration for image generation and text-to-speech ☆50 · Jan 29, 2026 · Updated 4 months ago
- ☆607 · Oct 26, 2024 · Updated last year
- ☆20 · Jul 19, 2024 · Updated last year
- [ICLR 2025] Official Pytorch Implementation of "Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN" by Pengxia… ☆30 · Jul 24, 2025 · Updated 10 months ago
- AdaSkip: Adaptive Sublayer Skipping for Accelerating Long-Context LLM Inference ☆20 · Jan 24, 2025 · Updated last year
- Implementation of "Look, Listen and Recognise: character-aware audio-visual subtitling" ☆21 · Nov 3, 2025 · Updated 7 months ago
- ☆108 · Apr 4, 2026 · Updated 2 months ago
- Optimized noise library for C# using SIMD. Works with both Unity Burst and .NET Core. ☆48 · Mar 31, 2026 · Updated 2 months ago
- A unified robotic manipulation learning framework ☆22 · Sep 4, 2025 · Updated 9 months ago
- ☆21 · Jul 25, 2023 · Updated 2 years ago
- 🕵 Code for our EMNLP 2025 Main paper: "FlashAdventure: A Benchmark for GUI Agents Solving Full Story Arcs in Diverse Adventure Games" ☆26 · Apr 26, 2026 · Updated last month
- [SIGGRAPH Asia 2025] CHARM: Control-point-based 3D Anime Hairstyle Auto-Regressive Modeling ☆49 · Apr 17, 2026 · Updated last month
- This is the official repository of Daily-Omni: Towards Audio-Visual Reasoning with Temporal Alignment across Modalities ☆42 · Apr 28, 2026 · Updated last month
- Official PyTorch inference code for the Interspeech 2025 paper: Efficient Speech Enhancement via Embeddings from Pre-trained Generative A… ☆80 · Jun 16, 2025 · Updated 11 months ago
- Multi speaker audio transcription ☆46 · Nov 25, 2022 · Updated 3 years ago
- [ICML 2024] Sparse Model Inversion: Efficient Inversion of Vision Transformers with Less Hallucination ☆14 · Apr 29, 2025 · Updated last year
- Chrome extension to add a link from each Arxiv page to the corresponding HF Paper page ☆26 · Jan 4, 2024 · Updated 2 years ago
- Big Impulse Response Dataset ☆159 · Oct 19, 2022 · Updated 3 years ago
- [ACL2025 Findings] Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models ☆89 · May 20, 2025 · Updated last year
- [ICME 2025] DiffusionTalker: Efficient and Compact Speech-Driven 3D Talking Head via Personalizer-Guided Distillation ☆24 · Mar 25, 2025 · Updated last year
- [ACL Findings 2026] Official Implementation of "FastKV: Decoupling of Context Reduction and KV Cache Compression for Prefill-Decoding Acc… ☆32 · Apr 14, 2026 · Updated last month
- Code for A Large-scale Dataset for Audio-Language Representation Learning ☆14 · Sep 18, 2024 · Updated last year
- Pytorch implementation of "Oscillation-Reduced MXFP4 Training for Vision Transformers" on DeiT Model Pre-training ☆40 · May 4, 2026 · Updated last month
- An open source code of the GitHub Copilot Workspace ☆13 · Jun 8, 2024 · Updated 2 years ago