The official repo for paper "Spatial Speech Translation: Translating Across Space With Binaural Hearables"
☆73Aug 15, 2025Updated 8 months ago
Alternatives and similar repositories for Spatial-Speech-Translation
Users who are interested in Spatial-Speech-Translation are comparing it to the libraries listed below.
- Core ML Demos is an experimental Core ML app. It visualizes the inference results of ML models and can be used to benchmark ML models and…☆12Jan 8, 2026Updated 3 months ago
- EchoX: Towards Mitigating Acoustic-Semantic Gap via Echo Training for Speech-to-Speech LLMs☆47Sep 19, 2025Updated 7 months ago
- ☆14May 20, 2025Updated 11 months ago
- ☆15Apr 6, 2026Updated 3 weeks ago
- ☆64Jul 1, 2025Updated 10 months ago
- Project for speech bubble☆61Aug 15, 2025Updated 8 months ago
- ☆22Aug 21, 2025Updated 8 months ago
- Run DeepSeek R1 model on an Ubuntu single board computer without user registration.☆14Apr 9, 2025Updated last year
- LINEBot☆13Apr 7, 2025Updated last year
- Official implementation of the paper "MusicInfuser: Making Video Diffusion Listen and Dance" (CVPR'26)☆83Updated this week
- Official GPU implementation of the paper "PPLLaVA: Varied Video Sequence Understanding With Prompt Guidance"☆133Nov 19, 2024Updated last year
- ☆166Nov 29, 2024Updated last year
- ☆476May 19, 2025Updated 11 months ago
- Async MCP server with Minimax API integration for image generation and text-to-speech☆50Jan 29, 2026Updated 3 months ago
- ☆606Oct 26, 2024Updated last year
- ☆20Jul 19, 2024Updated last year
- KeySync: A Robust Approach for Leakage-free Lip Synchronization in High Resolution☆384Jan 23, 2026Updated 3 months ago
- [NeurIPS'24 spotlight] Official Repo for AcoustiX used in Acoustic volume rendering for neural impulse response fields.☆37Dec 15, 2025Updated 4 months ago
- [ICLR 2025] Official Pytorch Implementation of "Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN" by Pengxia…☆29Jul 24, 2025Updated 9 months ago
- Open-source audio recorder and transcriber for macOS☆79Feb 27, 2026Updated 2 months ago
- [ICCV 2025] DSO: Aligning 3D Generators with Simulation Feedback for Physical Soundness☆182Feb 11, 2026Updated 2 months ago
- ☆17Jan 31, 2023Updated 3 years ago
- This repository extends the mask editor in ComfyUI and supports a lasso method for applying masks☆14Jul 23, 2025Updated 9 months ago
- AdaSkip: Adaptive Sublayer Skipping for Accelerating Long-Context LLM Inference☆20Jan 24, 2025Updated last year
- Langchain desktop app @multi-Agent☆30Jun 8, 2024Updated last year
- ☆104Apr 4, 2026Updated 3 weeks ago
- A unified robotic manipulation learning framework☆22Sep 4, 2025Updated 7 months ago
- ☆21Jul 25, 2023Updated 2 years ago
- 🕵 Code for our EMNLP 2025 Main paper: "FlashAdventure: A Benchmark for GUI Agents Solving Full Story Arcs in Diverse Adventure Games"☆25Updated this week
- [SIGGRAPH Asia 2025] CHARM: Control-point-based 3D Anime Hairstyle Auto-Regressive Modeling☆48Apr 17, 2026Updated 2 weeks ago
- Sound Event Localization and Detection using Neural Generalized Cross-Correlations☆33Feb 11, 2025Updated last year
- This is the official repository of Daily-Omni: Towards Audio-Visual Reasoning with Temporal Alignment across Modalities☆40Updated this week
- Official PyTorch inference code for the Interspeech 2025 paper: Efficient Speech Enhancement via Embeddings from Pre-trained Generative A…☆77Jun 16, 2025Updated 10 months ago
- [ICML 2024] Sparse Model Inversion: Efficient Inversion of Vision Transformers with Less Hallucination☆14Apr 29, 2025Updated last year
- Multi speaker audio transcription☆45Nov 25, 2022Updated 3 years ago
- Chrome extension to add a link from each Arxiv page to the corresponding HF Paper page☆26Jan 4, 2024Updated 2 years ago
- Playground that demonstrates advanced uses of Swift's Codable☆19Sep 23, 2018Updated 7 years ago
- Big Impulse Response Dataset☆158Oct 19, 2022Updated 3 years ago
- [ACL2025 Findings] Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models☆89May 20, 2025Updated 11 months ago