Mamba-YOLO-World: Marrying YOLO-World with Mamba for Open-Vocabulary Detection
☆97Mar 12, 2025Updated 11 months ago
Alternatives and similar repositories for Mamba-YOLO-World
Users who are interested in Mamba-YOLO-World are comparing it to the libraries listed below.
- YOLO-UniOW: Efficient Universal Open-World Object Detection☆175Jan 17, 2025Updated last year
- Implementation of YOLO and IOU tracker in C++☆18Dec 20, 2021Updated 4 years ago
- (CVPR 2025 highlight✨) Official repository of paper "LLMDet: Learning Strong Open-Vocabulary Object Detectors under the Supervision of La…☆561Feb 4, 2026Updated last month
- OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion☆400Mar 12, 2025Updated 11 months ago
- [ICML 2025] LaCache: Ladder-Shaped KV Caching for Efficient Long-Context Modeling of Large Language Models☆17Nov 4, 2025Updated 4 months ago
- Codes for our paper "AgentMonitor: A Plug-and-Play Framework for Predictive and Secure Multi-Agent Systems"☆13Dec 13, 2024Updated last year
- ☆12Updated this week
- [ICLR 2026] Official repo for "Spotlight on Token Perception for Multimodal Reinforcement Learning"☆49Jan 30, 2026Updated last month
- [WACV 2025] Official code for our paper "Enhancing Novel Object Detection via Cooperative Foundational Models"☆84Jan 2, 2026Updated 2 months ago
- Export XFeat to an ONNX model that accepts input images of arbitrary size☆13Jul 24, 2024Updated last year
- Scripts to convert rosinality/stylegan2-pytorch and NVlabs / stylegan2-ada-pytorch to the jit traceable and cpu compatible format☆10Apr 27, 2021Updated 4 years ago
- CoV: Chain-of-View Prompting for Spatial Reasoning☆51Jan 23, 2026Updated last month
- Mixture-of-Basis-Experts for Compressing MoE-based LLMs☆29Dec 24, 2025Updated 2 months ago
- Mitigating Shortcuts in Visual Reasoning with Reinforcement Learning☆45Jul 2, 2025Updated 8 months ago
- (NeurIPS2023) CoDet: Co-Occurrence Guided Region-Word Alignment for Open-Vocabulary Object Detection☆123Apr 26, 2024Updated last year
- ☆13Jul 30, 2024Updated last year
- An End-to-End Model with Adaptive Filtering for Retrieval-Augmented Generation☆16Oct 27, 2024Updated last year
- Python scripts performing Open Vocabulary Object Detection using the YOLO-World model in ONNX.☆62Apr 7, 2024Updated last year
- [CVPR 2025] DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception☆153Jan 10, 2026Updated last month
- ☆35Nov 25, 2025Updated 3 months ago
- YOLO-World-ONNX is a Python package for running inference on YOLO-WORLD Open-vocabulary-object detection model using ONNX models. It prov…☆15Feb 6, 2026Updated last month
- A novel lightweight monocular depth estimation method☆32Nov 17, 2025Updated 3 months ago
- Code for paper: Unified Text-to-Image Generation and Retrieval☆16Jul 6, 2024Updated last year
- Code and Data for "FaithfulRAG: Fact-Level Conflict Modeling for Context-Faithful Retrieval-Augmented Generation" (ACL25)☆29Oct 26, 2025Updated 4 months ago
- UniCombine: Unified Multi-Conditional Combination with Diffusion Transformer☆123Jun 27, 2025Updated 8 months ago
- Awesome OVD-OVS - A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future☆215Apr 3, 2025Updated 11 months ago
- Implementation of SmoothCache, a project aimed at speeding-up Diffusion Transformer (DiT) based GenAI models with error-guided caching.☆48Jul 17, 2025Updated 7 months ago
- Code release for "Weakly Supervised Open-Vocabulary Object Detection", AAAI2024☆35Sep 9, 2024Updated last year
- [CVPR 2024] Real-Time Open-Vocabulary Object Detection☆6,227Feb 26, 2025Updated last year
- [EMNLP 2024] Tree of Problems: Improving structured problem solving with compositionality☆19Mar 4, 2025Updated last year
- [ACL 2025] Squeezed Attention: Accelerating Long Prompt LLM Inference☆57Nov 20, 2024Updated last year
- [CVPR 2025] GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model☆132Mar 22, 2025Updated 11 months ago
- ☆52Jan 15, 2026Updated last month
- iLLaVA: An Image is Worth Fewer Than 1/3 Input Tokens in Large Multimodal Models☆21Jan 29, 2025Updated last year
- Source code of our EMNLP 2024 paper "FactAlign: Long-form Factuality Alignment of Large Language Models"☆19Oct 3, 2024Updated last year
- [ICME 2025] DiffusionTalker: Efficient and Compact Speech-Driven 3D Talking Head via Personalizer-Guided Distillation☆24Mar 25, 2025Updated 11 months ago
- ☆41Jan 10, 2025Updated last year
- TPDiff: Temporal Pyramid Video Diffusion Model☆25Mar 13, 2025Updated 11 months ago
- YOLOE: Real-Time Seeing Anything [ICCV 2025]☆2,062Jun 26, 2025Updated 8 months ago