thu-ml / zh-clip

☆72

Alternatives and similar repositories for zh-clip

Users who are interested in zh-clip are comparing it to the repositories listed below.

TencentARC-QQ / QA-CLIP
Chinese CLIP models with SOTA performance.
☆59Updated 2 years ago
360CVGroup / SEEChat
Multimodal chatbot with computer vision capabilities integrated, our 1st-gen LMM
☆101Updated last year
ksOAn6g5 / TaiSu
TaiSu (太素): a large-scale Chinese multimodal dataset (a hundred-million-scale Chinese vision-language pre-training dataset)
☆191Updated last year
alipay / Ant-Multi-Modal-Framework
Research Code for Multimodal-Cognition Team in Ant Group
☆168Updated last week
X-PLUG / Youku-mPLUG
Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Pre-training Dataset and Benchmarks
☆301Updated last year
mynameischaos / Lion
Lion: Kindling Vision Intelligence within Large Language Models
☆51Updated last year
scenarios / WeMM
☆87Updated last year
opendatalab / laion5b-downloader
☆116Updated 2 years ago
yuxie11 / R2D2
☆167Updated last year
pleisto / yuren-baichuan-7b
Open-source multimodal large language model based on baichuan-7b
☆72Updated last year
friedrichor / UNITE
official code for "Modality Curation: Building Universal Embeddings for Advanced Multimodal Information Retrieval"
☆35Updated 3 months ago
QQ-MM / QQMM-embed
☆21Updated last week
BIGBALLON / UME-Search
Toward Universal Multimodal Embedding
☆62Updated 2 months ago
MUGE-2021 / image-generation-baseline
☆32Updated 3 years ago
xverse-ai / XVERSE-V-13B
☆79Updated last year
360CVGroup / 360VL
Our 2nd-gen LMM
☆34Updated last year
WePOINTS / WePOINTS
☆186Updated 8 months ago
large-ocr-model / large-ocr-model.github.io
☆184Updated last year
will-singularity / Skywork-MM
Empirical Study Towards Building An Effective Multi-Modal Large Language Model
☆22Updated 2 years ago
CuriseJia / ECCV24-FreeStyleRet
Precision Search through Multi-Style Inputs
☆72Updated 2 months ago
MonolithFoundation / Bumblebee
A Simple MLLM Surpassed QwenVL-Max with OpenSource Data Only in 14B LLM.
☆38Updated last year
billjie1 / Chinese-CLIP
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
☆169Updated 2 years ago
yuyq96 / TextHawk
Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models
☆63Updated 11 months ago
yangjianxin1 / OFA-Chinese
Chinese OFA model in the transformers architecture
☆135Updated 2 years ago
VectorSpaceLab / MegaPairs
[ACL 2025 Oral] 🔥🔥 MegaPairs: Massive Data Synthesis for Universal Multimodal Retrieval
☆228Updated 5 months ago
Ucas-HaoranWei / Vary-family
☆57Updated last year
ForeverPs / IncrementalVHD_GPE
official code for paper: Exploring Domain Incremental Video Highlights Detection with the LiveFood Benchmark
☆39Updated last year
applenob / clip_chinese_text_encoder
Chinese encoder for CLIP
☆22Updated 3 years ago
PCIResearch / TransCore-M
Large Multimodal Model
☆15Updated last year
rednote-hilab / dots.vlm1
The official repository of the dots.vlm1 instruct models proposed by rednote-hilab.
☆261Updated last month