breezedeus / Coin-CLIP
Coin-CLIP: a CLIP model fine-tuned on a large collection of coin images using contrastive learning. It improves feature extraction for coins, which raises image search accuracy. The model combines a Vision Transformer (ViT) with CLIP's multimodal learning and is optimized for numismatic applications.
☆23Updated last year
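As a rough illustration of the image search use case described above, the sketch below ranks gallery images by cosine similarity between embedding vectors. It does not call Coin-CLIP itself: the 4-dimensional vectors, the file names, and the `search` helper are all hypothetical stand-ins for features that a fine-tuned image encoder would produce.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def search(query, gallery, top_k=3):
    """Return the top_k (name, cosine similarity) pairs, best match first."""
    q = l2_normalize(query)
    scored = []
    for name, emb in gallery.items():
        g = l2_normalize(emb)
        scored.append((name, sum(a * b for a, b in zip(q, g))))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical embeddings; a real encoder would output much longer vectors.
gallery = {
    "coin_a.jpg": [0.9, 0.1, 0.0, 0.2],
    "coin_b.jpg": [0.1, 0.8, 0.3, 0.0],
    "coin_c.jpg": [0.85, 0.15, 0.05, 0.25],
}

# The query points in the same direction as coin_a.jpg (scaled by 2).
query = [1.8, 0.2, 0.0, 0.4]
results = search(query, gallery, top_k=2)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

Normalizing both sides makes the ranking depend only on direction, not magnitude, which is the usual choice for CLIP-style embeddings.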
Alternatives and similar repositories for Coin-CLIP
Users that are interested in Coin-CLIP are comparing it to the libraries listed below
- Chinese CLIP models with SOTA performance.☆59Updated 2 years ago
- ☆57Updated last year
- Our 2nd-gen LMM☆34Updated last year
- Multimodal chatbot with computer vision capabilities integrated, our 1st-gen LMM☆101Updated last year
- convert paddleOCR to torchOCR, ppocr-v3,ppocr-v4, onnx, openvino☆33Updated 2 years ago
- A Simple MLLM Surpassed QwenVL-Max with OpenSource Data Only in 14B LLM.☆38Updated last year
- Research on GOT-OCR: accelerating project deployment, language-agnostic☆64Updated last year
- ICDAR 2024 Table OCR Model☆38Updated 4 months ago
- Code for using the ONNX version of DuGuang OCR in both Chinese and English☆49Updated last week
- Contrast-guided Feature Adjustment Module for Visual Information Extraction☆30Updated 2 years ago
- ☆79Updated last year
- Deploying the 1.5B model distilled from DeepSeek-R1 on Android phones☆22Updated 9 months ago
- ☆186Updated last year
- For learning GOT/Qwen/OnnxLLm☆53Updated last year
- Deep learning models and datasets for the medical industry, open-sourced progressively☆13Updated 3 years ago
- SPRINT: Script-agnostic Structure Recognition in Tables☆14Updated 8 months ago
- segment anything model (SAM) infer by ncnn on Android mobile phone☆29Updated 2 years ago
- An unofficial Implementation of DocParser: End-to-end OCR-free Information Extraction from Visually Rich Documents☆37Updated 2 years ago
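- Adds some files missing from Visualglm so that Visualglm can be trained; the example performs physiognomy (face reading) recognition on faces☆13Updated 2 years ago
- Deploys Chinese CLIP with OpenCV + onnxruntime for text-to-image search: given a sentence describing the desired picture, it retrieves matching images from a gallery. Includes both C++ and Python versions☆81Updated last year
- 💡💡💡awesome computer vision app in gradio☆55Updated last year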
- ☆15Updated 2 years ago
- [WACV 2026] Official implementation of the paper: “CountingDINO: A Training-free Pipeline for Exemplar-based Class-Agnostic Counting”☆43Updated 2 weeks ago
- Deploys Robust Video Matting with ONNXRuntime; includes both C++ and Python versions☆47Updated 4 years ago
- A light proxy solution for HuggingFace hub.☆47Updated 2 years ago
- Python scripts performing Open Vocabulary Object Detection using the YOLO-World model in ONNX.☆62Updated last year
- Video classification annotation and video spatio-temporal annotation☆44Updated 2 years ago
- Swin-Unet (Swin Transformer Unet) is used to identify table structure in document images☆27Updated last year
- VimTS: A Unified Video and Image Text Spotter☆79Updated last year
- Trains a small, elegant formula recognition model based on TrOCR and the UniMER-1M dataset☆27Updated 5 months ago