A Python package for interacting with the MinerU Vision-Language Model.
☆109Apr 8, 2026Updated this week
Alternatives and similar repositories for mineru-vl-utils
Users who are interested in mineru-vl-utils are comparing it to the libraries listed below.
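- WanJuan-CC is a high-quality dataset built from CommonCrawl through data extraction, rule-based cleaning, deduplication, safety filtering, and quality cleaning.☆13Apr 18, 2024Updated last year
- Reading order, Layoutreader☆19May 8, 2025Updated 11 months ago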
- Data annotation component library, provided as NPM packages☆147Mar 18, 2026Updated 3 weeks ago
- ☆14Apr 19, 2024Updated last year
- Data Set Description Language Specification (DSDL, a next-generation description language for AI datasets)☆46May 29, 2024Updated last year
- Large-Scale High-quality Chinese Web Text with Multi-dimensional and fine-grained information☆38Dec 2, 2024Updated last year
- Metaso AI Search (秘塔AI搜索) Python SDK https://metaso.cn☆15Apr 21, 2025Updated 11 months ago
- DELT: Data Efficacy for Language Model Training☆45Feb 12, 2026Updated last month
- ☆121Jan 15, 2026Updated 2 months ago
- Diffusion Model Improvement Method☆35Sep 4, 2023Updated 2 years ago
- Preview markdown files in yazi with mdcat☆10Apr 24, 2025Updated 11 months ago
- PHP with FPM Dockerfile for trusted automated Docker builds.☆12Mar 2, 2016Updated 10 years ago
- MinerU-HTML: An SLM-powered HTML main content extractor that outputs clean HTML bodies. Perfect for Deep Research Agents, RAG application…☆229Mar 27, 2026Updated last week
- Used to store and organize public papers.☆18Feb 26, 2021Updated 5 years ago
- CURRENNT -- CUDA-enabled machine learning library for recurrent neural networks☆16Feb 20, 2020Updated 6 years ago
- Implementation of research paper "Deep Splitting and Merging for Table Structure Decomposition"☆61Nov 9, 2022Updated 3 years ago
- A helper package to get information about scholarly articles from DBLP using its public API☆15May 13, 2025Updated 10 months ago
- VITS speech synthesis, fine-tuned on Chinese only☆12Mar 15, 2023Updated 3 years ago
- BERT and RoBERTa pretraining code, implemented in both TensorFlow and PyTorch☆13Feb 8, 2023Updated 3 years ago
- A Faster LayoutReader Model based on LayoutLMv3, Sort OCR bboxes to reading order.☆320Aug 15, 2025Updated 7 months ago
- This repository contains the code for the Transformer-Representation Neural Topic Model (TNTM) based on the paper "Probabilistic Topic Mo…☆12Jul 6, 2024Updated last year
- ☆16Apr 30, 2025Updated 11 months ago
- cpp inference for EmotiVoice☆16Jan 1, 2024Updated 2 years ago
- Generates the large amounts of data needed by text error-correction models (such as Gector).☆14Jan 5, 2023Updated 3 years ago
- Tonghuashun (同花顺) Algorithm Challenge Platform: [September-October bimonthly contest] text semantic matching with cross-domain transfer☆11Oct 28, 2021Updated 4 years ago
- Ongoing research project for code&math LLMs☆29Jul 4, 2025Updated 9 months ago
- Chinese keyword extraction☆14Aug 7, 2023Updated 2 years ago
- All Digital Phase-Locked Loop☆13May 22, 2023Updated 2 years ago
- Forcing Diffuse Distributions out of Language Models☆18Sep 10, 2024Updated last year
- ☆25Nov 7, 2022Updated 3 years ago
- Youtu-Parsing: Perception, Structuring and Recognition via High-Parallelism Decoding☆64Feb 10, 2026Updated 2 months ago
- CycleCenternet based on MMDetection☆22Jun 28, 2023Updated 2 years ago
- NPUEval is an LLM evaluation dataset written specifically to target AIE kernel code generation on RyzenAI hardware.☆30Nov 8, 2025Updated 5 months ago
- The official repository for Multi3WOZ: A Multilingual, Multi-Domain, Multi-Parallel Dataset for Training and Evaluating Culturally Adapte…☆17Jan 15, 2024Updated 2 years ago
- The official implementation of CTRNet++.☆15Dec 30, 2024Updated last year
- Collection of papers, benchmarks and newest trends in the domain of End-to-end ToDs☆14Nov 18, 2023Updated 2 years ago
- The Open-Source Data Annotation Platform☆1,207Feb 19, 2025Updated last year
- Applications of adversarial training in NLP☆14Nov 22, 2021Updated 4 years ago
- ☆21Jun 1, 2025Updated 10 months ago