The official repo for the paper "Spatial Speech Translation: Translating Across Space With Binaural Hearables"
☆72 · Aug 15, 2025 · Updated 6 months ago
Alternatives and similar repositories for Spatial-Speech-Translation
Users who are interested in Spatial-Speech-Translation are comparing it to the libraries listed below.
- ☆28 · Jul 31, 2025 · Updated 7 months ago
- ☆62 · Jul 1, 2025 · Updated 8 months ago
- ☆15 · Jan 12, 2026 · Updated last month
- Core ML Demos is an experimental Core ML app. It visualizes the inference results of ML models and can be used to benchmark ML models and… ☆12 · Jan 8, 2026 · Updated 2 months ago
- ☆15 · Apr 11, 2024 · Updated last year
- EchoX: Towards Mitigating Acoustic-Semantic Gap via Echo Training for Speech-to-Speech LLMs ☆46 · Sep 19, 2025 · Updated 5 months ago
- Official GPU implementation of the paper "PPLLaVA: Varied Video Sequence Understanding With Prompt Guidance" ☆132 · Nov 19, 2024 · Updated last year
- [SIGGRAPH Asia 2025] CHARM: Control-point-based 3D Anime Hairstyle Auto-Regressive Modeling ☆44 · Sep 26, 2025 · Updated 5 months ago
- ☆22 · Aug 21, 2025 · Updated 6 months ago
- 🕵 Code for our EMNLP 2025 Main paper: "FlashAdventure: A Benchmark for GUI Agents Solving Full Story Arcs in Diverse Adventure Games" ☆24 · Dec 14, 2025 · Updated 2 months ago
- [ICLR 2026] Information Gain-based Policy Optimization: A Simple and Effective Approach for Multi-Turn Search Agents ☆35 · Updated this week
- A unified robotic manipulation learning framework ☆21 · Sep 4, 2025 · Updated 6 months ago
- Audio Jailbreak: An Open Comprehensive Benchmark for Jailbreaking Large Audio-Language Models ☆30 · Oct 6, 2025 · Updated 5 months ago
- [EMNLP 2025] Code for paper "Table-R1: Inference-Time Scaling for Table Reasoning" ☆29 · Jun 3, 2025 · Updated 9 months ago
- AdaSkip: Adaptive Sublayer Skipping for Accelerating Long-Context LLM Inference ☆20 · Jan 24, 2025 · Updated last year
- [ICME 2025] DiffusionTalker: Efficient and Compact Speech-Driven 3D Talking Head via Personalizer-Guided Distillation ☆24 · Mar 25, 2025 · Updated 11 months ago
- [ACL2025 Findings] Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models ☆89 · May 20, 2025 · Updated 9 months ago
- ☆476 · May 19, 2025 · Updated 9 months ago
- KeySync: A Robust Approach for Leakage-free Lip Synchronization in High Resolution ☆377 · Jan 23, 2026 · Updated last month
- ☆20 · Jul 19, 2024 · Updated last year
- [AAAI 2026] Multimodal Deepresearcher: Generating Text-Chart Interleaved Reports From Scratch with Agentic Framework ☆45 · Jan 25, 2026 · Updated last month
- [ICLR 2025] Official Pytorch Implementation of "Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN" by Pengxia… ☆29 · Jul 24, 2025 · Updated 7 months ago
- Official Implementation of FastKV: Decoupling of Context Reduction and KV Cache Compression for Prefill-Decoding Acceleration ☆30 · Nov 22, 2025 · Updated 3 months ago
- [ICCV 2025] DSO: Aligning 3D Generators with Simulation Feedback for Physical Soundness ☆175 · Feb 11, 2026 · Updated 3 weeks ago
- Reproducible Language Agent Research ☆34 · Jun 25, 2025 · Updated 8 months ago
- ☆29 · Feb 4, 2025 · Updated last year
- [NeurIPS 2025] Official implementation of "Reasoning Path Compression: Compressing Generation Trajectories for Efficient LLM Reasoning" ☆30 · Oct 20, 2025 · Updated 4 months ago
- HyperGaussians: High-Dimensional Gaussian Splatting for High-Fidelity Animatable Face Avatars ☆42 · Mar 1, 2026 · Updated last week
- A Text2SQL benchmark for evaluation of Large Language Models ☆41 · Updated this week
- [ICCV 2025] MRGen: Segmentation Data Engine for Underrepresented MRI Modalities ☆39 · Sep 26, 2025 · Updated 5 months ago
- The first spoken long-text dataset derived from live streams, designed to reflect the redundancy-rich and conversational nature of real-w… ☆12 · Jun 28, 2025 · Updated 8 months ago
- ☆34 · Nov 10, 2025 · Updated 4 months ago
- The Code and Script of "David's Slingshot: A Strategic Coordination Framework of Small LLMs Matches Large LLMs in Data Synthesis" ☆34 · Jun 13, 2025 · Updated 8 months ago
- Pytorch implementation of "Oscillation-Reduced MXFP4 Training for Vision Transformers" on DeiT Model Pre-training ☆36 · Jun 20, 2025 · Updated 8 months ago
- The official implementation of "Sparse-vDiT: Unleashing the Power of Sparse Attention to Accelerate Video Diffusion Transformers" (arXiv … ☆51 · Jun 6, 2025 · Updated 9 months ago
- LLM Inference with Microscaling Format ☆34 · Nov 12, 2024 · Updated last year
- [ICCV 2025] Dynamic-VLM ☆28 · Dec 16, 2024 · Updated last year
- Pytorch implementation of "CleanMel: Mel-Spectrogram Enhancement for Improving Both Speech Quality and ASR". ☆89 · Feb 2, 2026 · Updated last month
- [ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models ☆35 · Jun 12, 2024 · Updated last year