Official PyTorch implementation of "Omni-AVSR: Towards Unified Multimodal Speech Recognition with Large Language Models" [IEEE ICASSP 2026].
☆34 · Mar 10, 2026 · Updated last month
Alternatives and similar repositories for Omni-AVSR
Users that are interested in Omni-AVSR are comparing it to the libraries listed below.
- Official PyTorch implementation of "Large Language Models are Strong Audio-Visual Speech Recognition Learners" [ICASSP 2025] and "Mitigat… ☆62 · Jan 18, 2026 · Updated 3 months ago
- ☆13 · Oct 25, 2024 · Updated last year
- ☆64 · Jul 1, 2025 · Updated 9 months ago
- A simple GitLab/GitHub webhooks daemon ☆16 · Updated this week
- APAR: LLMs Can Do Auto-Parallel Auto-Regressive Decoding ☆14 · Jul 22, 2024 · Updated last year
- Tistory Readme Stat Card ☆11 · Mar 27, 2024 · Updated 2 years ago
- ☆12 · Apr 12, 2026 · Updated last week
- Code for Learning to Learn Language from Narrated Video ☆33 · Oct 3, 2023 · Updated 2 years ago
- ☆10 · Mar 3, 2026 · Updated last month
- Unofficial PyTorch implementation of "PoE-GAN: Multimodal Conditional Image Synthesis with Product-of-Experts GANs" ☆14 · Mar 29, 2023 · Updated 3 years ago
- Mad Square's Brawl is a 2D Android platformer PvP game. ☆17 · Feb 15, 2023 · Updated 3 years ago
- ☆99 · Feb 4, 2026 · Updated 2 months ago
- [CVPR2025] Official code for Lost in Translation Found in Context ☆23 · Jan 14, 2026 · Updated 3 months ago
- DO with Terraform and Ansible ☆11 · Jun 5, 2018 · Updated 7 years ago
- ☆11 · Nov 17, 2018 · Updated 7 years ago
- ☆10 · Oct 24, 2024 · Updated last year
- A Wordle game written in Rust, refined. Play in browser with the power of WebAssembly! Course project of Programming Training, Tsinghua U… ☆17 · Jul 10, 2024 · Updated last year
- Extensive Self-Contrast Enables Feedback-Free Language Model Alignment ☆20 · Apr 2, 2024 · Updated 2 years ago
- Support library for the MaskRCNN masks extracted on EPIC-KITCHENS-100 ☆14 · Dec 1, 2020 · Updated 5 years ago
- A minimal Java desktop app with an awesome Swing-based UI to drag and drop files programmatically. ☆24 · Jan 19, 2018 · Updated 8 years ago
- Official code for the paper "Understanding Co-speech Gestures in-the-wild" ☆24 · Oct 31, 2025 · Updated 5 months ago
- A framework for building speech-enabled websites. ☆10 · Jul 10, 2015 · Updated 10 years ago
- We are building an app that helps take the worry out of choosing a gift for someone. ☆22 · Jan 26, 2022 · Updated 4 years ago
- Unsupervised Multi-object Segmentation by Predicting Probable Motion Patterns ☆17 · Nov 15, 2022 · Updated 3 years ago
- An experimental modular OS written in Rust. ☆17 · Feb 11, 2025 · Updated last year
- Speech2Action CVPR Poster Source Code ☆20 · Apr 29, 2020 · Updated 5 years ago
- Research code for "Towards multi-task learning of speech and speaker recognition" at https://arxiv.org/pdf/2302.12773.pdf ☆12 · Dec 2, 2024 · Updated last year
- ☆12 · Apr 26, 2025 · Updated 11 months ago
- Openreviewers: Multi Agent Academic Review Simulation System ☆23 · Mar 2, 2024 · Updated 2 years ago
- VoxSRC2022 workshop development kit ☆19 · Jul 21, 2022 · Updated 3 years ago
- Code for Deep Multimodal Clustering for Unsupervised Audiovisual Learning (CVPR2019) ☆15 · May 27, 2020 · Updated 5 years ago
- ☆12 · Mar 12, 2023 · Updated 3 years ago
- Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition. ☆17 · May 16, 2025 · Updated 11 months ago
- Evaluation script for VoxMovies dataset in PyTorch ☆23 · Jan 12, 2024 · Updated 2 years ago
- Chinese–English Stopword List (3,076 entries, including special symbols) ☆22 · Jan 7, 2026 · Updated 3 months ago
- FIBO-Edit brings the power of structured prompt generation to image editing ☆39 · Jan 29, 2026 · Updated 2 months ago
- Implementation of "With a Little Help from my Temporal Context: Multimodal Egocentric Action Recognition, BMVC, 2021" in PyTorch ☆20 · Dec 16, 2021 · Updated 4 years ago
- >>> Exceptions and interrupts + virtual-memory page tables + branch prediction + TLB + Cache + Flash + VGA + uCore ☆20 · Nov 17, 2023 · Updated 2 years ago
- Official implementation of RAVEn (ICLR 2023) and BRAVEn (ICASSP 2024) ☆81 · Feb 27, 2025 · Updated last year