sugarandgugu / Text2Image-Retrieval
Computer vision course project: an image-text retrieval system based on Chinese-CLIP
☆93 Updated 2 years ago
Alternatives and similar repositories for Text2Image-Retrieval
Users that are interested in Text2Image-Retrieval are comparing it to the libraries listed below
- Personal Project: MPP-Qwen14B & MPP-Qwen-Next (Multimodal Pipeline Parallel based on Qwen-LM). Support [video/image/multi-image] {sft/conv… ☆467 Updated 5 months ago
- This project aims to retrieve images that match an input text description. ☆41 Updated last year
- Learning Semantic Relationship among Instances for Image-Text Matching, CVPR, 2023 ☆90 Updated 3 months ago
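- The easiest-to-start, zero-barrier chatglm3 & agent & langchain project ☆223 Updated last year
- Chinese CLIP: supports custom datasets and extracts vectors from text and images to perform text-image matching. ☆22 Updated 2 years ago
- Internet image-text matching based on multimodal retrieval ☆14 Updated last year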
- Linguistic-Aware Patch Slimming Framework for Fine-grained Cross-Modal Alignment, CVPR, 2024 ☆97 Updated last month
- An LLM-based tool to chat with your documents and databases, including a management system | A large-model (LLM) knowledge-base question-answering system for internal enterprise environments, including an admin back end ☆107 Updated 2 years ago
- Code for AAAI 2024 paper: Relax Image-Specific Prompt Requirement in SAM: A Single Generic Prompt for Segmenting Camouflaged Objects ☆154 Updated 5 months ago
- Object detection using yolov8 as the baseline model and VisDrone2019 as the dataset, with the author's own improvement strategies ☆99 Updated last year
- A collection of multimodal reasoning papers, codes, datasets, benchmarks and resources. ☆283 Updated this week
- Multi-Modal learning toolkit based on PaddlePaddle and PyTorch, supporting multiple applications such as multi-modal classification, cros… ☆477 Updated 2 years ago
- Chinese large language model ☆122 Updated 2 years ago
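- Build a simple basic multimodal large model from scratch. 🤖 ☆44 Updated last year
- Text classification with bert, roberta, ernie, and other methods ☆88 Updated 2 years ago
- A multimodal classification task implemented in pytorch; bert and resnet extract features for the text and image parts respectively (multiple models can be combined in the config), achieving good performance on my small-scale dataset (96% validation accuracy) ☆80 Updated 2 years ago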
- Medical Multimodal LLMs ☆330 Updated 3 months ago
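- Trains a LLaVA model with better Chinese support and open-sources the training code and data. ☆64 Updated 11 months ago
- An ecosystem of large language models and multimodal models, mainly covering cross-modal search, speculative decoding, QAT quantization, multimodal quantization, ChatBot, and OCR ☆186 Updated 2 weeks ago
- The llava-Qwen2-7B-Instruct-Chinese-CLIP model improves Chinese text recognition and the understanding of what memes imply, approaching the recognition level of gpt4o and claude-3.5-sonnet! ☆24 Updated last year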
- WWW2025 Multimodal Intent Recognition for Dialogue Systems Challenge ☆122 Updated 9 months ago
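- 2024.06.19 This project uses Chinese-CLIP to build text-to-image and image-to-image search pages, aiming to help users get started quickly with cross-modal retrieval. The code uses about 190,000 (189,585) images from the MUGE dataset as the gallery data. The project provides feature extraction, retrieval, and UI code. ☆18 Updated last year
- RASA Chinese task-oriented chatbot ☆103 Updated 9 months ago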
- [ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization ☆579 Updated last year
- [EMNLP 2023] FreeAL: Towards Human-Free Active Learning in the Era of Large Language Models ☆91 Updated last year
- Multi-Branch Auxiliary Fusion YOLO with Re-parameterization Heterogeneous Convolutional for accurate object detection. ☆100 Updated 8 months ago
- Fine-tunes and deploys DeepSeek-R1-Distill-Qwen-32B on 2 million medical data records ☆157 Updated 5 months ago
- Fine-tunes a multimodal large model for X-ray recognition based on qwenvl ☆21 Updated 9 months ago
- YiJian-Comunity: a full-process automated large model safety evaluation tool designed for academic research ☆115 Updated 10 months ago
- 【CVPR'2023 Highlight & TPAMI】Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval? ☆241 Updated 8 months ago