The official repo for paper "Spatial Speech Translation: Translating Across Space With Binaural Hearables"
☆74Aug 15, 2025Updated 9 months ago
Alternatives and similar repositories for Spatial-Speech-Translation
Users that are interested in Spatial-Speech-Translation are comparing it to the libraries listed below.
- ☆31Jul 31, 2025Updated 9 months ago
- Core ML Demos is an experimental Core ML app. It visualizes the inference results of ML models and can be used to benchmark ML models and…☆12Jan 8, 2026Updated 4 months ago
- EchoX: Towards Mitigating Acoustic-Semantic Gap via Echo Training for Speech-to-Speech LLMs☆47Sep 19, 2025Updated 8 months ago
- Microphone Array Real-time System☆13Jun 7, 2017Updated 8 years ago
- ☆15Apr 6, 2026Updated last month
- ☆64Jul 1, 2025Updated 10 months ago
- ☆22Aug 21, 2025Updated 9 months ago
- ☆21Jul 15, 2024Updated last year
- Official implementation of the paper "MusicInfuser: Making Video Diffusion Listen and Dance" (CVPR'26)☆83May 3, 2026Updated 2 weeks ago
- Official GPU implementation of the paper "PPLLaVA: Varied Video Sequence Understanding With Prompt Guidance"☆134Nov 19, 2024Updated last year
- ☆166Nov 29, 2024Updated last year
- ☆478May 19, 2025Updated last year
- ☆606Oct 26, 2024Updated last year
- ☆20Jul 19, 2024Updated last year
- [ICCV 2025] DSO: Aligning 3D Generators with Simulation Feedback for Physical Soundness☆183Feb 11, 2026Updated 3 months ago
- ☆17Jan 31, 2023Updated 3 years ago
- This repository extends the mask editor in Comfyui and supports lasso method for applying masks☆14Jul 23, 2025Updated 9 months ago
- Implementation of "Look, Listen and Recognise: character-aware audio-visual subtitling"☆20Nov 3, 2025Updated 6 months ago
- ☆106Apr 4, 2026Updated last month
- A unified robotic manipulation learning framework☆22Sep 4, 2025Updated 8 months ago
- 🕵 Code for our EMNLP 2025 Main paper: "FlashAdventure: A Benchmark for GUI Agents Solving Full Story Arcs in Diverse Adventure Games"☆25Apr 26, 2026Updated 3 weeks ago
- [SIGGRAPH Asia 2025] CHARM: Control-point-based 3D Anime Hairstyle Auto-Regressive Modeling☆49Apr 17, 2026Updated last month
- Sound Event Localization and Detection using Neural Generalized Cross-Correlations☆34Feb 11, 2025Updated last year
- This is the official repository of Daily-Omni: Towards Audio-Visual Reasoning with Temporal Alignment across Modalities☆41Apr 28, 2026Updated 3 weeks ago
- Official PyTorch inference code for the Interspeech 2025 paper: Efficient Speech Enhancement via Embeddings from Pre-trained Generative A…☆79Jun 16, 2025Updated 11 months ago
- Big Impulse Response Dataset☆159Oct 19, 2022Updated 3 years ago
- [ACL2025 Findings] Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models☆89May 20, 2025Updated last year
- StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.☆1,267Jun 29, 2025Updated 10 months ago
- [ACL Findings 2026] Official Implementation of "FastKV: Decoupling of Context Reduction and KV Cache Compression for Prefill-Decoding Acc…☆32Apr 14, 2026Updated last month
- [ICME 2025] DiffusionTalker: Efficient and Compact Speech-Driven 3D Talking Head via Personalizer-Guided Distillation☆24Mar 25, 2025Updated last year
- VKriez Edge Preprocessors nodes for ComfyUI☆16Mar 18, 2025Updated last year
- Moved to this repository 👇☆47Aug 29, 2024Updated last year
- An open source and enterprise-grade implementation of the orchestrator-worker pattern from Anthropic's paper, "How we built our multi-age…☆29Oct 9, 2025Updated 7 months ago
- frp intranet penetration (NAT traversal) on container platforms via Cloudflare Tunnel☆49Apr 17, 2025Updated last year
- [ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models☆35Jun 12, 2024Updated last year
- [CVPR 2025 Highlight] FIMA-Q: Post-Training Quantization for Vision Transformers by Fisher Information Matrix Approximation☆29Jun 16, 2025Updated 11 months ago
- KV Cache Steering for Inducing Reasoning in Small Language Models☆49Jul 24, 2025Updated 9 months ago
- Intelligent video processing system☆47Dec 26, 2024Updated last year
- ComfyUI implementation of RAVE https://rave-video.github.io/☆95May 22, 2024Updated last year