LLaVA inference with multiple images at once for cross-image analysis.
☆51 · Mar 25, 2024 · Updated 2 years ago
Alternatives and similar repositories for LLaVA-CLI-with-multiple-images
Users that are interested in LLaVA-CLI-with-multiple-images are comparing it to the libraries listed below.
- [ 🎯 NeurIPS 2025 ] 3D-RAD 🩻: A Comprehensive 3D Radiology Med-VQA Dataset with Multi-Temporal Analysis and Diverse Diagnostic Tasks ☆30 · Oct 28, 2025 · Updated 6 months ago
- Test-Time Adaptation of Vision-Language Models for Open-Vocabulary Semantic Segmentation ☆28 · Sep 20, 2025 · Updated 7 months ago
- [ICLR 2026] Empowering Small VLMs to Think with Dynamic Memorization and Exploration ☆16 · Mar 18, 2026 · Updated last month
- ☆28 · Feb 7, 2024 · Updated 2 years ago
- This sample shows how to use the oneAPI Video Processing Library (oneVPL) to perform a single and multi-source video decode and preproces… ☆15 · Jun 15, 2023 · Updated 2 years ago
- ☆13 · Jun 11, 2024 · Updated last year
- OpenVINO LLM Benchmark ☆11 · Dec 7, 2023 · Updated 2 years ago
- Demo on iGPU for FFmpeg decode and scale, OpenVINO inference. this is zero-copy solution, which means No frame data copy from CPU to iGPU… ☆17 · Jan 25, 2023 · Updated 3 years ago
- (CVPR 2023) Official code of MACARONS: Mapping And Coverage Anticipation with RGB ONline Self-supervision. Also contains an updated and i… ☆85 · Dec 23, 2023 · Updated 2 years ago
- ☆17 · Dec 13, 2023 · Updated 2 years ago
- ☆18 · May 13, 2024 · Updated last year
- ☆10 · Mar 4, 2024 · Updated 2 years ago
- The implementation for ThreadWeaver Adaptive Threading for Efficient Parallel Reasoning in Language Models ☆56 · Apr 8, 2026 · Updated 3 weeks ago
- Download Web-10K data by querying Bing Image Search ☆10 · Feb 1, 2022 · Updated 4 years ago
- Spatial-Temporal Knowledge-Embedded Transformer for Video Scene Graph Generation (TIP 2024, ACM MM 2023) ☆20 · Mar 13, 2024 · Updated 2 years ago
- My implement of InstantBooth ☆13 · Sep 11, 2023 · Updated 2 years ago
- Pytorch implementations of Co-teaching for noisy label learning ☆13 · Jun 28, 2022 · Updated 3 years ago
- Streaming Video Diffusion: Online Video Editing with Diffusion Models ☆18 · Jun 3, 2024 · Updated last year
- Tools for easier OpenVINO development/debugging ☆10 · Jul 16, 2025 · Updated 9 months ago
- Code for EMNLP2021 paper “Transductive Learning for Unsupervised Text Style Transfer” ☆12 · Sep 19, 2021 · Updated 4 years ago
- ☆19 · Mar 14, 2023 · Updated 3 years ago
- [ICLR 2023] CoVLM: Composing Visual Entities and Relationships in Large Language Models Via Communicative Decoding ☆46 · Jun 9, 2025 · Updated 10 months ago
- ☆11 · Jun 21, 2025 · Updated 10 months ago
- Harness for deep search agent ☆85 · Updated this week
- ☆11 · Apr 7, 2026 · Updated 3 weeks ago
- [CVPR 2025] Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention ☆67 · Jul 16, 2024 · Updated last year
- Computer vision labs for Vision and Sports Summer School 2022 ☆10 · Jul 29, 2022 · Updated 3 years ago
- ☆37 · Dec 20, 2023 · Updated 2 years ago
- Hierarchical Vision Transformers for Disease Progression Detection in Chest X-Ray Images ☆11 · Jan 11, 2024 · Updated 2 years ago
- ☆16 · Mar 6, 2024 · Updated 2 years ago
- NeuS adapted to use multires hash encoding ☆60 · Aug 18, 2022 · Updated 3 years ago
- Repo for ICCV 2021 paper: Beyond Question-Based Biases: Assessing Multimodal Shortcut Learning in Visual Question Answering ☆29 · Jul 1, 2024 · Updated last year
- MokA: Multimodal Low-Rank Adaptation for MLLMs ☆88 · Dec 30, 2025 · Updated 4 months ago
- Running 3D HPE at 30fps ☆15 · Jan 3, 2022 · Updated 4 years ago
- Unofficial implementation for Sigmoid Loss for Language Image Pre-Training ☆11 · Sep 26, 2023 · Updated 2 years ago
- ☆12 · Nov 22, 2022 · Updated 3 years ago
- official repo for paper "[CLS] Token Tells Everything Needed for Training-free Efficient MLLMs" ☆22 · Apr 23, 2025 · Updated last year
- PyTorch DataLoader for many VQA datasets ☆15 · Jan 10, 2023 · Updated 3 years ago
- Interactive Skeleton Based Few Shot Action Recognition ☆14 · Nov 8, 2022 · Updated 3 years ago