opendatalab / opendatalab-datasets
datasets resource
☆110Updated last month
Alternatives and similar repositories for opendatalab-datasets:
Users that are interested in opendatalab-datasets are comparing it to the libraries listed below
- Data Set Description Language Specification (DSDL, a next-generation AI dataset description language)☆47Updated 10 months ago
- Data annotation component library, provided as NPM packages☆87Updated 2 weeks ago
- SDK of OpenDataLab - https://opendatalab.org.cn☆57Updated last year
- ☆25Updated 2 years ago
- The Open-Source Data Annotation Platform☆788Updated 2 months ago
- WanJuan 1.0 multimodal corpus☆558Updated last year
- Data annotation toolbox that supports image, audio, and video data.☆1,164Updated this week
- ☆481Updated 8 months ago
- AAAI 2024: Visual Instruction Generation and Correction☆92Updated last year
- Dingo: A Comprehensive Data Quality Evaluation Tool☆130Updated last week
- ☆16Updated this week
- vLLM-accelerated implementation of GOT, combined with MinerU for PDF parsing in RAG☆55Updated 5 months ago
- [ACL2024 Findings] Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models☆346Updated last year
- Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent☆300Updated last month
- ☆322Updated 10 months ago
- As the name suggests: a hand-built RAG☆121Updated last year
- MPB (Miner-PDF-Benchmark) is an end-to-end PDF document comprehension evaluation suite designed for large-scale model data scenarios.☆22Updated 4 months ago
- A unified tool to generate fine-tuning datasets for LLMs, including questions, answers, and dialogues. ✨🤖📚💬☆59Updated last month
- ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents☆452Updated last month
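- Generate SFT training data from your private-domain data using free LLM APIs (entirely free of charge); supports the training data formats of tools such as llamafactory; synthetic data☆155Updated 5 months ago
- Exploring the potential of LLM applications in the legal industry☆87Updated 4 months ago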
- UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition☆310Updated last month
- Enhance LLM agents with rich tool APIs☆384Updated 7 months ago
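- LLM Group Chat Framework: chat with multiple LLMs at the same time.☆293Updated last year
- A role-playing multi-LLM chat room fine-tuned from InternLM2, built on the original text of Journey to the West, its vernacular Chinese version, and ChatGPT-generated data. This project covers everything about role-playing LLMs, from data acquisition and data processing, to fine-tuning with XTuner and deploying to OpenXLab, to deploying with LMDeploy, with op…☆98Updated last year
- 360LayoutAnalysis, a series of Document Analysis Models and Datasets developed by 360 AI Research Institute☆278Updated 7 months ago
- Native Chinese retrieval-augmented generation evaluation benchmark☆115Updated last year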
- ☆59Updated last year
- ☆49Updated last year
- Analysis of Chinese and English layouts☆201Updated 3 weeks ago