BIGBALLON / UME-Search
Toward Universal Multimodal Embedding
☆74 · Aug 1, 2025 · Updated 6 months ago
Alternatives and similar repositories for UME-Search
Users that are interested in UME-Search are comparing it to the libraries listed below
- [CVPR25] CoLLM: A Large Language Model for Composed Image Retrieval ☆28 · Mar 26, 2025 · Updated 10 months ago
- This repo contains the code for "VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks" [ICLR 2025] ☆569 · Updated this week
- The top conferences on video retrieval libraries in recent years, synchronized with my blog. ☆14 · Nov 27, 2021 · Updated 4 years ago
- code for the paper "CoReS: Orchestrating the Dance of Reasoning and Segmentation" ☆21 · Nov 24, 2025 · Updated 2 months ago
- WeThink: Toward General-purpose Vision-Language Reasoning via Reinforcement Learning ☆36 · Jun 10, 2025 · Updated 8 months ago
- Code for "CAFe: Unifying Representation and Generation with Contrastive-Autoregressive Finetuning" ☆32 · Mar 26, 2025 · Updated 10 months ago
- ☆27 · Dec 3, 2021 · Updated 4 years ago
- Benchmarking Video-LLMs on Video Spatio-Temporal Reasoning ☆41 · Aug 4, 2025 · Updated 6 months ago
- Composed Video Retrieval ☆62 · May 2, 2024 · Updated last year
- LLaVE: Large Language and Vision Embedding Models with Hardness-Weighted Contrastive Learning ☆75 · May 23, 2025 · Updated 8 months ago
- [CVPR 2025 Highlight] Your Large Vision-Language Model Only Needs A Few Attention Heads For Visual Grounding ☆60 · Aug 31, 2025 · Updated 5 months ago
- [ICCV 2023] - Composed Image Retrieval on Common Objects in context (CIRCO) dataset ☆85 · Aug 6, 2025 · Updated 6 months ago
- A curated list of papers and resources related to Described Object Detection, Open-Vocabulary/Open-World Object Detection and Referring E… ☆342 · Nov 6, 2025 · Updated 3 months ago
- Kate is a Multimodal Live Assistant that ignites your browsing experience ☆11 · Feb 15, 2025 · Updated last year
- The repository of VG-Refiner paper ☆17 · Dec 9, 2025 · Updated 2 months ago
- ☆12 · Oct 12, 2020 · Updated 5 years ago
- ESG Insights AI simplifies ESG data analysis with advanced AI models, ensuring compliance with GRI standards. It helps asset managers ass… ☆13 · Oct 31, 2024 · Updated last year
- A collection of awesome think with videos papers. ☆87 · Dec 1, 2025 · Updated 2 months ago
- A repository of all code and resources of my published blog articles. ☆36 · Dec 21, 2025 · Updated last month
- ☆45 · Oct 17, 2025 · Updated 3 months ago
- [CVPR 2025] LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant ☆176 · Jul 7, 2025 · Updated 7 months ago
- Referring Image Segmentation Benchmarking with Segment Anything Model (SAM) ☆38 · Apr 7, 2023 · Updated 2 years ago
- [CVPR2025] Official code for Lost in Translation Found in Context ☆23 · Jan 14, 2026 · Updated last month
- a python lib for neural networks, file and image processing etc. ☆10 · Feb 11, 2020 · Updated 6 years ago
- ☆11 · Sep 25, 2022 · Updated 3 years ago
- Scripts for computing performance evaluation metrics for deep hashing image retrieval and deep hashing cross-modal retrieval ☆13 · Oct 30, 2024 · Updated last year
- An image fusion technique presented in "Poisson image editing", P. Pérez, M. Gangnet, and A. Blake, SIGGRAPH 2003. ☆14 · Jan 13, 2020 · Updated 6 years ago
- ☆14 · Apr 14, 2025 · Updated 10 months ago
- [IJCAI-2024] The official code of Self-Supervised Pre-training with Symmetric Superimposition Modeling for Scene Text Recognition ☆10 · Aug 10, 2025 · Updated 6 months ago
- A rules induction system for data mining and exploratory data analysis ☆11 · Jul 17, 2024 · Updated last year
- [CVPR 2024] LoSh: Long-Short Text Joint Prediction Network for Referring Video Object Segmentation ☆13 · Jun 17, 2024 · Updated last year
- Adaptive and Robust Multi-Task Learning ☆10 · May 19, 2024 · Updated last year
- [NeurIPS 2025 Spotlight] Transformer Copilot: Learning from The Mistake Log in LLM Fine-tuning ☆15 · Nov 14, 2025 · Updated 3 months ago
- Find strongest response of convolutional layers on an image dataset. Automatically compute receptive field for any CNN layer. ☆14 · Feb 19, 2021 · Updated 4 years ago
- ☆14 · Mar 26, 2025 · Updated 10 months ago
- ☆13 · Aug 28, 2024 · Updated last year
- [ECCV2024] ClearCLIP: Decomposing CLIP Representations for Dense Vision-Language Inference ☆97 · Mar 26, 2025 · Updated 10 months ago
- Official code for paper "UniIR: Training and Benchmarking Universal Multimodal Information Retrievers" (ECCV 2024) ☆177 · Oct 1, 2024 · Updated last year
- LLM-Seg: Bridging Image Segmentation and Large Language Model Reasoning ☆195 · Apr 16, 2024 · Updated last year