☆44Jan 5, 2023Updated 3 years ago
Alternatives and similar repositories for acoustic-model
Users who are interested in acoustic-model are comparing it to the libraries listed below
- A forum-style WeChat mini program☆12Dec 12, 2022Updated 3 years ago
- This is the formal code implementation of the CVPR 2024 paper 'Traceable Federated Continual Learning'.☆18May 31, 2024Updated last year
- [AAAI2025] MoRe: Class Patch Attention Needs Regularization for Weakly Supervised Semantic Segmentation☆20Jan 15, 2025Updated last year
- ☆17Feb 26, 2024Updated 2 years ago
- Official implementation of the paper "ProxyDet: Synthesizing Proxy Novel Classes via Classwise Mixup for Open-Vocabulary Object Detection…☆26Feb 13, 2024Updated 2 years ago
- The official implementation of A Counting-Aware Hierarchical Decoding Framework for Generalized Referring Expression Segmentation☆23Aug 17, 2025Updated 6 months ago
- [IEEE TCSVT 2023] The implementation of our paper Semi-Supervised Subspace Clustering via Tensor Low-Rank Representation.☆25Dec 21, 2023Updated 2 years ago
- ☆22May 9, 2024Updated last year
- [CVPR'2022, TPAMI'2024] LAVT: Language-Aware Vision Transformer for Referring Segmentation☆24Jan 21, 2025Updated last year
- WeChat mini program: 研坛论道 (information exchange / resource publishing platform)☆26Apr 15, 2019Updated 6 years ago
- Official Implementation of "Open-Vocabulary Audio-Visual Semantic Segmentation" [ACM MM 2024 Oral].☆35Nov 2, 2024Updated last year
- Chat about anything on any video!☆39Sep 5, 2023Updated 2 years ago
- Official implementation of the WACV 2024 paper CLIP-DIY☆34Dec 20, 2023Updated 2 years ago
- [CVPR 2024] Narrative Action Evaluation with Prompt-Guided Multimodal Interaction☆42May 16, 2024Updated last year
- Plan, Posture and Go: Towards Open-World Text-to-Motion Generation☆42Nov 19, 2024Updated last year
- The repository contains the official implementation of "DPMesh: Exploiting Diffusion Prior for Occluded Human Mesh Recovery", CVPR 2024☆45Jun 4, 2024Updated last year
- ☆60Aug 12, 2024Updated last year
- DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding☆66Jun 10, 2025Updated 8 months ago
- Official implementation of ICML2024 Cascade-CLIP: Cascaded Vision-Language Embeddings Alignment for Zero-Shot Semantic Segmentation☆57Aug 15, 2024Updated last year
- ☆59Sep 14, 2024Updated last year
- PyTorch Implementation of NACLIP in "Pay Attention to Your Neighbours: Training-Free Open-Vocabulary Semantic Segmentation"☆73Sep 23, 2024Updated last year
- [ICLR 2025] SAMRefiner: Taming Segment Anything Model for Universal Mask Refinement☆85Apr 19, 2025Updated 10 months ago
- Official Repo for PosSAM: Panoptic Open-vocabulary Segment Anything☆70Apr 7, 2024Updated last year
- [ACL2025 Findings] Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models☆89May 20, 2025Updated 9 months ago
- [ICCV 2023] CTVIS: Consistent Training for Online Video Instance Segmentation☆80Oct 15, 2023Updated 2 years ago
- [ICCV2025] AnyBimanual: Transferring Unimanual Policy for General Bimanual Manipulation☆98Jun 26, 2025Updated 8 months ago
- [NeurIPS'24] Efficient and accurate memory-saving method towards W4A4 large multi-modal models.☆98Jan 3, 2025Updated last year
- UniVG-R1: Reasoning Guided Universal Visual Grounding with Reinforcement Learning☆158Jun 2, 2025Updated 9 months ago
- [ECCV2024] ProxyCLIP: Proxy Attention Improves CLIP for Open-Vocabulary Segmentation☆111Mar 26, 2025Updated 11 months ago
- [AAAI 2024] TagCLIP: A Local-to-Global Framework to Enhance Open-Vocabulary Multi-Label Classification of CLIP Without Training☆108Jan 9, 2024Updated 2 years ago
- SegEarth-R1: Geospatial Pixel Reasoning via Large Language Model☆141Jan 21, 2026Updated last month
- A collection of papers on discrete diffusion models☆168Jun 30, 2025Updated 8 months ago
- A curated list of Multimodal Captioning related research (including image captioning, video captioning, and text captioning)☆113Jun 6, 2022Updated 3 years ago
- [CVPR24] Official Implementation of GEM (Grounding Everything Module)☆138Apr 10, 2025Updated 10 months ago
- [NeurIPS 2024] One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos☆146Dec 26, 2024Updated last year
- Summary about Video-to-Text datasets. This repository is part of the review paper *Bridging Vision and Language from the Video-to-Text Pe…☆133Oct 27, 2023Updated 2 years ago
- [ECCV 2024🔥] Official implementation of the paper "ST-LLM: Large Language Models Are Effective Temporal Learners"☆151Sep 10, 2024Updated last year
- [ICCV 2025] Code for Momentum-GS: Momentum Gaussian Self-Distillation for High-Quality Large Scene Reconstruction☆171Dec 15, 2025Updated 2 months ago
- [CVPR 2025 Oral] SegEarth-OV: Towards Training-Free Open-Vocabulary Segmentation for Remote Sensing Images☆248Jul 9, 2025Updated 8 months ago