opendatalab / opendatalab-datasets
datasets resource
☆127Updated 6 months ago
Alternatives and similar repositories for opendatalab-datasets
Users who are interested in opendatalab-datasets are comparing it to the libraries listed below.
- The Open-Source Data Annotation Platform☆1,158Updated 11 months ago
- Data annotation toolbox supports image, audio and video data.☆1,467Updated 3 months ago
- ☆546Updated last year
- Data Set Description Language Specification (DSDL, a next-generation AI dataset description language)☆47Updated last year
- WanJuan 1.0 multimodal corpus☆570Updated 2 years ago
- 360LayoutAnalysis, a series of document analysis models and datasets developed by 360 AI Research Institute☆305Updated last year
- Layout analysis for Chinese and English documents☆260Updated 5 months ago
- SDK of OpenDataLab - https://opendatalab.org.cn☆58Updated 5 months ago
- Dingo: A Comprehensive AI Data, Model and Application Quality Evaluation Tool☆617Updated this week
- To try TextIn document parsing, visit https://cc.co/16YSIy☆124Updated 6 months ago
- Data annotation component library, provided as NPM packages☆145Updated 2 months ago
- PDF Parsing Tool: GOT's vLLM acceleration implementation, MinerU for layout recognition, and GOT for table formula parsing.☆65Updated last year
- ☆360Updated last year
- Alpaca Chinese Dataset: a Chinese instruction fine-tuning dataset☆217Updated last year
- ☆25Updated 3 years ago
- [EMNLP 2025] ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents☆619Updated last week
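- An ecosystem of large language models and multimodal models, mainly covering cross-modal search, speculative decoding, QAT quantization, multimodal quantization, ChatBot, and OCR☆194Updated 5 months ago
- Llama3-Tutorial (XTuner, LMDeploy, OpenCompass)☆511Updated last year
- Inference library for sequence-based table recognition, integrating table recognition algorithms such as PP-Structure and modelscope.☆407Updated 4 months ago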
- A unified tool to generate fine-tuning datasets for LLMs, including questions, answers, and dialogues. ✨🤖📚💬☆62Updated 10 months ago
- An easy-to-use framework for modular RAG☆428Updated this week
- ☆73Updated last year
- This is a user guide for the MiniCPM and MiniCPM-V series of small language models (SLMs) developed by ModelBest. “面壁小钢炮” focuses on achi…☆297Updated 6 months ago
- LLM101n: Let's build a Storyteller (Chinese edition)☆138Updated last year
- As the name suggests: a hand-built RAG☆131Updated last year
- A pre-built agent for TableGPT2.☆632Updated 2 weeks ago
- Collects the best currently open-source table recognition models, improves pre- and post-processing, and converts the models to ONNX☆915Updated 5 months ago
- Chinese corpus cleaning and quality evaluation for large model pre-training☆74Updated last year
- Document orientation classification☆224Updated last year
- ☆101Updated 3 years ago