This project uses the LLaVA 1.6 multimodal model to implement text-to-image and image-to-image search.
☆28Feb 26, 2024Updated 2 years ago
Alternatives and similar repositories for multi-modal-image-search
Users who are interested in multi-modal-image-search are comparing it to the libraries listed below.
- Using OpenVINO to speed up MeloTTS inference☆15Nov 1, 2024Updated last year
- Guide to deploying deep-learning inference networks and deep vision primitives on SOPHON TPU.☆20Nov 14, 2025Updated 5 months ago
- [NAACL 2025🔥] MEDA: Dynamic KV Cache Allocation for Efficient Multimodal Long-Context Inference☆20Jun 19, 2025Updated 10 months ago
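- In RAG, generating and matching embedding vectors is a key step. This article introduces an embedding service based on CLIP/BLIP models that supports embedding generation and similarity computation for both text and images, providing a foundation for multimodal information retrieval.☆42Dec 28, 2024Updated last year
- This project is for document question answering: it uses vector embeddings + ES for recall, a Rerank model for fine-grained ranking, and an LLM to answer questions over the documents. The web framework is Flask.☆34Mar 17, 2025Updated last year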
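- An intelligent dialogue system based on ChatGLM3-6b that integrates RAG, knowledge graphs, agents, and multimodal techniques to improve the quality of the model's responses.☆68Aug 12, 2024Updated last year
- A repository for pre-training from scratch and then SFT-ing a small-parameter Chinese LLaMa2; it runs on a single 24G GPU and yields a chat-llama2 with basic Chinese question-answering ability.☆17Feb 29, 2024Updated 2 years ago
- The general-purpose digital human system is an intelligent interaction platform based on deep learning and WebRTC, integrating Azure Avatar digital human rendering, speech recognition and synthesis, natural language processing, and other technologies. The system supports real-time dialogue, knowledge Q&A, and emotional interaction, and can achieve smooth rendering above 30 FPS with low-latency responses under 200 ms. Core features include GPT-based intelligent dialogue, …☆31Dec 17, 2025Updated 4 months ago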
- [NeurIPS 2025] Official Implementation of ViSpec: Accelerating Vision-Language Models with Vision-Aware Speculative Decoding.☆53Jan 28, 2026Updated 3 months ago
- Our 2nd-gen LMM☆34May 22, 2024Updated last year
- [Recsys'2023] "RCL: Multi-Relational Contrastive Learning for Recommendation"☆16Sep 6, 2023Updated 2 years ago
- A human-in-the-loop ad copy generation application implemented with Streamlit and LangGraph☆11Feb 15, 2025Updated last year
- ECCV 2022, MonoPLFlowNet☆10Jun 14, 2024Updated last year
- EvalDNN: A Toolbox for Evaluating Deep Neural Network Models☆14Mar 9, 2020Updated 6 years ago
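- Uses the BERT pre-trained language model to obtain vector representations of review text, learns their semantic features with a Bi-GRU network, and assigns weights to the review vectors using sentiment weights and an attention mechanism respectively, dynamically adjusting their influence on user features and product features. User and product features are obtained by weighted summation, and finally the DeepFM algorithm is applied to the user features and product features for deep…☆16Mar 28, 2023Updated 3 years ago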
- Efficient Token-Guided Image-Text Retrieval with Consistent Multimodal Contrastive Training☆30Jun 20, 2023Updated 2 years ago
- GPT Table Semantic Parsing with complex & non-intuitive structure.☆17Jul 16, 2025Updated 9 months ago
- Built on index-tts-vllm, implements and provides an interface service for simulated streaming audio synthesis, along with a client test script☆26Sep 2, 2025Updated 8 months ago
- ☆11May 8, 2020Updated 5 years ago
- [NAACL 2024] Topics, Authors, and Institutions in Large Language Model Research: Trends from 17K arXiv Papers https://arxiv.org/abs/2307.…☆17Jan 27, 2024Updated 2 years ago
- Graduation project (a parking lot billing system based on OpenCV license plate recognition)☆12Jul 16, 2022Updated 3 years ago
- An AIGC application that integrates an LLM with SDXL☆29Jan 3, 2024Updated 2 years ago
- Happy Hacking With Claude!!!☆25Oct 27, 2025Updated 6 months ago
- Codes for ACL 2023 paper: Exploiting! Multimodal Relation Extraction with Feature Denoising and Multimodal Topic Modeling.☆20Jun 25, 2024Updated last year
- PyTorch implementation of "A Simple Baseline for Low-Budget Active Learning".☆14Dec 22, 2021Updated 4 years ago
- Awesome Self-Supervised Vision Learning☆11Mar 27, 2024Updated 2 years ago
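- An AI-driven virtual digital human live-streaming system that supports modular development of 2D/3D digital humans, TTS, ASR, lip sync, stream pushing, interaction, and more.☆24May 13, 2025Updated 11 months ago
- An automated analysis system for JD.com product reviews, based on web crawling and AI☆20Dec 27, 2024Updated last year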
- [ECCV 2022] "TALISMAN: Targeted Active Learning for Object Detection with Rare Classes and Slices using Submodular Mutual Information" by…☆10Sep 21, 2022Updated 3 years ago
- This project involves using Large Language Models (LLM) for efficient mobile robot path planning. It integrates AI techniques for real-ti…☆29Jan 24, 2026Updated 3 months ago
- ☆15Jan 15, 2024Updated 2 years ago
- [DeepRead] This is the official implementation of the DeepRead paper.☆45Updated this week
- [ICML2025] The official implementation of "C-3PO: Compact Plug-and-Play Proxy Optimization to Achieve Human-like Retrieval-Augmented Gene…☆44May 3, 2025Updated last year
- ☆14Jun 5, 2024Updated last year
- [ACL2026 Findings] "Towards Hierarchical Multi-Step Reward Models for Enhanced Reasoning in Large Language Models"☆20Mar 25, 2025Updated last year
- A Simple Framework for CV Pre-training Model (SOCO, VirTex, BEiT)☆15Oct 18, 2021Updated 4 years ago
- Official implementation of paper ReTaKe: Reducing Temporal and Knowledge Redundancy for Long Video Understanding☆40Mar 16, 2025Updated last year
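- One-click deployment of a WeChat mini program and a web admin back end for self-hosted, queued automatic digital human training and queued short-video generation. Based on single-person training of audio-driven lip sync, claimed to be better than wav2lip, deepfacelab, liveportrait, musetalk, and other lip-sync solutions; ready for commercial use; supports voice cloning in Chinese, Japanese, English, and Korean☆59Apr 14, 2025Updated last year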
- Automatic defect recognition in X-ray testing using computer vision☆13Dec 8, 2018Updated 7 years ago