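Uses Qwen2-VL as the base multimodal large model and applies instruction fine-tuning to implement OCR for a specific scenario; intended for learning how to fine-tune multimodal LLMs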
☆25 Jan 18, 2025 Updated last year
Alternatives and similar repositories for Qwen2-VL-LaTex_OCR
Users who are interested in Qwen2-VL-LaTex_OCR are comparing it to the libraries listed below.
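- Deploy ONNX models with TensorRT and LibTorch ☆19 Nov 17, 2021 Updated 4 years ago
- A personal learning project on fine-tuning a medical large language model ☆31 Dec 3, 2025 Updated 5 months ago
- Compares several LoRA fine-tuning approaches on Qwen3-VL-4B-Instruct, Qwen's latest multimodal image-text model, and deploys the result with LangChain + RAG + multi-agent ☆45 Dec 14, 2025 Updated 5 months ago
- This project builds and optimizes from scratch a large-scale pretrained language model at the ten-million-parameter level, covering three stages: pretraining, supervised fine-tuning (SFT), and R1 reasoning distillation. It uses a custom Transformer architecture (including RMSNorm, grouped attention, a multi-query mechanism, SwiGLU activation, and RoPE positional encoding) to achieve efficient long-text processing and… ☆22 Mar 10, 2025 Updated last year
- BMInf demos. ☆16 Oct 14, 2021 Updated 4 years ago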
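- This repository records and shares my learning journey in the LLM and Agent fields and deepens my understanding of the related technologies through hands-on projects. By building LLM- and Agent-based applications from scratch, it covers LLM principles and Agent development experience. ☆27 Mar 28, 2025 Updated last year
- Intent classification with deep networks. ☆11 Feb 26, 2021 Updated 5 years ago
- The most basic, beginner-friendly introduction to natural language processing, based on deepseek-r1, covering both traditional NLP and modern large models ☆27 Jan 16, 2026 Updated 4 months ago
- Noah Health Information Management System ☆12 Dec 20, 2017 Updated 8 years ago
- Project page for the ICDAR 2023 Paper "Inv3D: a high-resolution 3D invoice dataset for template-guided single-image document unwarping". ☆13 Dec 21, 2023 Updated 2 years ago
- Packaged TResNet based on Official PyTorch Implementation ☆15 Oct 26, 2020 Updated 5 years ago
- ☆22 Mar 11, 2025 Updated last year
- Chatbot implementation using ChatGPT API and Gradio. ☆14 Mar 2, 2023 Updated 3 years ago
- ☆25 Apr 16, 2021 Updated 5 years ago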
- ☆30 Sep 2, 2019 Updated 6 years ago
- Kuaishou Express (快手极速版) ☆14 May 18, 2022 Updated 4 years ago
- The interview assistant system is an AI-based tool that transcribes the interviewer's audio to text in real time and suggests suitable answers. Supports a knowledge-base approach. ☆28 Mar 25, 2025 Updated last year
- Conditional VAE using CNN on MNIST in PyTorch ☆20 May 24, 2020 Updated 5 years ago
- ☆15 Feb 28, 2022 Updated 4 years ago
- Modeling Stroke Mask for End-to-End Text Erasing ☆19 Feb 9, 2023 Updated 3 years ago
- Check-ins for Sina Weibo, China Telecom mobile, Kuaishou Express, iQIYI, and WPS / auto-watching videos on Kuaishou, earning coins on Jinri Toutiao and Baidu Express ☆26 Mar 4, 2021 Updated 5 years ago
- [ECCV 2024] SAGS: Structure-Aware 3D Gaussian Splatting ☆38 Jul 9, 2025 Updated 10 months ago
- This is a TensorFlow implementation of SSH: Single Stage Headless Face Detector ☆32 Aug 11, 2019 Updated 6 years ago
- A re-implementation of PFLD, https://arxiv.org/abs/1902.10859 ☆45 Aug 27, 2019 Updated 6 years ago
- Template based form extractor OCR. Train your own character and alphabet OCR. ☆18 Oct 22, 2018 Updated 7 years ago
- Solution for the 2021 Sohu Campus Text Matching Algorithm Competition ☆19 Nov 7, 2024 Updated last year
- SOTA Document Image Enhancement - T2T-BinFormer: Effective Document Image Enhancement Using tokens-to-token Transformer Network ☆24 Dec 9, 2023 Updated 2 years ago
- vans-web is the front-end companion to vans: a front-end scaffold built on vue + elementui. ☆41 Jan 30, 2018 Updated 8 years ago
- A license plate recognition demo based on python-opencv (modified from https://blog.csdn.net/weixin_41695564/article/details/79712393) ☆21 Nov 25, 2021 Updated 4 years ago
- Backbone network replaced with an improved ResNet50 ☆25 Nov 4, 2023 Updated 2 years ago
- Python code for competitive adaptive reweighted sampling (CARS) ☆25 Apr 20, 2022 Updated 4 years ago
- Code from our paper "Template-guided Illumination Correction for Document Images with Imperfect Geometric Reconstruction" (ICCVW) 2023. ☆28 Feb 7, 2024 Updated 2 years ago
- Project on the assignment of ICD codes to medical/clinical text ☆22 Jun 12, 2023 Updated 2 years ago
- Face evaluation methods for benchmarks such as FDDB, WIDERFace, Megaface, etc. ☆48 Apr 24, 2018 Updated 8 years ago
- The source code for my blog site, developed in Java using the SpringMVC framework and a MySQL database. ☆24 Feb 27, 2025 Updated last year
- Trains a LLaVA model with better Chinese support, and open-sources the training code and data. ☆82 Sep 6, 2024 Updated last year
- Detailed video explanation of the code for GPT2-chitchat, a GPT2 model for Chinese chitchat ☆27 Jul 6, 2023 Updated 2 years ago
- Code for the paper "UVDoc: Neural Grid-based Document Unwarping" - Dataset capture and creation ☆32 May 27, 2024 Updated last year
- Baselines for CAIL2020-Argument-Mining: BERT, RNN ☆40 May 29, 2020 Updated 5 years ago