☆11 Oct 31, 2024 Updated last year
Alternatives and similar repositories for VHASR
Users who are interested in VHASR are comparing it to the repositories listed below
- ☆39 Sep 25, 2025 Updated 5 months ago
- [CVPR 2025] Docopilot: Improving Multimodal Models for Document-Level Understanding ☆36 Jul 22, 2025 Updated 7 months ago
- Official implementation of Progressive Detail Injection for Training-Free Semantic Binding in Text-to-Image Generation ☆32 Aug 3, 2025 Updated 7 months ago
- FaceShield: Explainable Face Anti-Spoofing with Multimodal Large Language Models ☆10 Dec 21, 2025 Updated 2 months ago
- A collection of strong multimodal models for building multimodal AGI agents ☆44 Jul 9, 2024 Updated last year
- ☆41 Apr 2, 2025 Updated 11 months ago
- After creating and training the smoking detection model using YOLOv5, the next step is to deploy the model. In this project, Flask API an… ☆10 Mar 1, 2023 Updated 3 years ago
- Accompanying code for the paper Sub-Cluster AdaCos: Learning Representations for Anomalous Sound Detection. ☆10 Jun 7, 2022 Updated 3 years ago
- elasticsearch7.9 cdh-ext-parcels and single machine multi instance ☆10 Jul 12, 2021 Updated 4 years ago
- ☆18 Feb 16, 2025 Updated last year
- Math24o: High School Olympiad Mathematics Chinese Benchmark ☆11 Mar 27, 2025 Updated 11 months ago
- ☆22 Dec 11, 2025 Updated 2 months ago
- ☆21 Jun 16, 2025 Updated 8 months ago
- Code for paper: "Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines" ☆11 Oct 11, 2024 Updated last year
- ☆16 Nov 11, 2025 Updated 3 months ago
- Open, royalty free, lyrics2song / song generation data collection / cleaning pipeline. ☆17 May 9, 2025 Updated 9 months ago
- Long Context Research ☆26 Jan 26, 2026 Updated last month
- EMNLP 2024 | Style-Specific Neurons for Steering LLMs in Text Style Transfer ☆13 Mar 23, 2025 Updated 11 months ago
- ☆14 Dec 14, 2023 Updated 2 years ago
- ☆97 Oct 16, 2025 Updated 4 months ago
- Cigarette Detection Model deployed over Django backend ☆15 Apr 4, 2023 Updated 2 years ago
- ☆13 Apr 2, 2024 Updated last year
- Code for the "Long Context Needs Some R&R" paper. ☆12 Mar 11, 2024 Updated last year
- ☆18 Jun 14, 2025 Updated 8 months ago
- Audio-Visual Speech Recognition ☆20 Jul 7, 2025 Updated 7 months ago
- "Towards Improving Document Understanding: An Exploration on Text-Grounding via MLLMs" 2023 ☆16 Nov 28, 2024 Updated last year
- ☆13 Feb 14, 2024 Updated 2 years ago
- A local search system implementation using Elasticsearch for Wikipedia data indexing and retrieval. ☆12 May 17, 2025 Updated 9 months ago
- Official Pytorch implementation of "Omni-AVSR: Towards Unified Multimodal Speech Recognition with Large Language Models" [IEEE ICASSP 202… ☆29 Jan 18, 2026 Updated last month
- ☆88 Jul 30, 2025 Updated 7 months ago
- UNCAGE: Contrastive Attention Guidance for Masked Generative Transformers in Text-to-Image Generation ☆18 Aug 12, 2025 Updated 6 months ago
- An open-source Agent Skill framework implementing progressive disclosure architecture ☆43 Jan 30, 2026 Updated last month
- Apache DolphinScheduler's Ambari plugin, deploy DolphinScheduler easier within Apache Ambari ☆10 Jan 30, 2023 Updated 3 years ago
- ☆21 Jul 24, 2025 Updated 7 months ago
- A reproduction of RetinaFace by PaddlePaddle ☆14 Dec 19, 2021 Updated 4 years ago
- ☆11 Apr 25, 2021 Updated 4 years ago
- ☆19 Dec 20, 2025 Updated 2 months ago
- The official code of "Beyond Walking: A Large-Scale Image-Text Benchmark for Text-based Person Anomaly Search" ☆26 Sep 15, 2025 Updated 5 months ago
- Scene Parsing via Integrated Classification Model and Variance-Based Regularization (Matlab&Caffe), In CVPR 2019 ☆11 Jun 11, 2019 Updated 6 years ago