datasets resource
☆144May 27, 2026Updated last week
Alternatives and similar repositories for opendatalab-datasets
Users that are interested in opendatalab-datasets are comparing it to the libraries listed below.
- ☆25Nov 7, 2022Updated 3 years ago
- AAAI 2024: Visual Instruction Generation and Correction☆97Feb 4, 2024Updated 2 years ago
- WanJuan-CC is a high-quality dataset built from CommonCrawl through data extraction, rule-based cleaning, deduplication, safety filtering, and quality cleaning.☆14Apr 18, 2024Updated 2 years ago
- Open-source multimodal data annotation platform with AI auto-annotation support.☆1,583Updated this week
- MLLM-DataEngine: An Iterative Refinement Approach for MLLM☆48May 24, 2024Updated 2 years ago
- The Open-Source Data Annotation Platform☆1,235Feb 19, 2025Updated last year
- Out-of-the-box Annotation Toolbox☆395Apr 19, 2024Updated 2 years ago
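- WanJuan3.0 ("WanJuan Silk Road") is a comprehensive plain-text corpus that collects publicly available web content, literature, patents, and other materials from multiple countries and regions. It totals more than 1.2 TB of data and over 300B tokens, which its authors describe as internationally leading. The first open-source release consists mainly of five subsets (Thai, Russian, Arabic, Korean, and Vietnamese); the data in each subset…☆46Feb 13, 2025Updated last year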
- UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition☆482Sep 28, 2025Updated 8 months ago
- (ICCV 2025) OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation☆102Dec 3, 2025Updated 6 months ago
- A Python package for interacting with the MinerU Vision-Language Model.☆128May 28, 2026Updated last week
- NanaDraw turns complex scientific ideas into clear, expressive visuals you can use right away. Powered by Nano Banana, it generates edita…☆103Apr 29, 2026Updated last month
- A Comprehensive Toolkit for High-Quality PDF Content Extraction☆9,696Jan 3, 2025Updated last year
- ☆121Jan 15, 2026Updated 4 months ago
- Dense Article Dataset (DAD): A Benchmark Dataset for Document Layout Analysis☆16Jan 13, 2022Updated 4 years ago
- The official implementation of the paper "CrossViewDiff: A Cross-View Diffusion Model for Satellite-to-Street View Synthesis"☆16Sep 2, 2024Updated last year
- Official implementation of Panacea: A foundation model for clinical trial design, recruitment, search, and summarization.☆21Dec 24, 2024Updated last year
- KITE (Knowledge-Intensive Task Evaluation) is an end-to-end benchmark for RAG pipelines☆23Aug 14, 2024Updated last year
- This is a repository for ACMMM22 paper "Exploring Effective Knowledge Transfer for Few-shot Object Detection"☆18Jun 21, 2023Updated 2 years ago
- Transforms complex documents like PDFs and Office docs into LLM-ready markdown/JSON for your Agentic workflows.☆66,024May 31, 2026Updated last week
- ☆10Oct 21, 2024Updated last year
- ☆14Apr 19, 2024Updated 2 years ago
- Blog contents☆10May 11, 2013Updated 13 years ago
- ☆10Feb 26, 2020Updated 6 years ago
- Deep learning models and datasets for the medical industry, open-sourced progressively☆13Dec 30, 2021Updated 4 years ago
- ☆23May 31, 2026Updated last week
- [ICCV 2025] The official implementation of the paper “Street-to-Satellite Image Synthesis with Diffusion Models and BEV Paradigm”☆81Oct 17, 2025Updated 7 months ago
- Sensitive-word filter list for public security network filing☆14Oct 7, 2018Updated 7 years ago
- [ICCV25 Highlight] The official implementation of the paper "LEGION: Learning to Ground and Explain for Synthetic Image Detection"☆78Oct 22, 2025Updated 7 months ago
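- Optimized training for OCR recognition of rare Chinese characters☆16Feb 16, 2023Updated 3 years ago
- vLLM hybrid inference extension plugin; supports multi-NUMA hybrid inference, and single-GPU inference of the Qwen3-Next model can reach 1000+ prefill☆33Nov 7, 2025Updated 7 months ago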
- Official repository for ODQA experiments from Decomposed Prompting: A Modular Approach for Solving Complex Tasks, ICLR23☆12Jul 28, 2023Updated 2 years ago
- [ICLR 2025 Spotlight] The official implementation of the paper “LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multi…☆179Feb 7, 2026Updated 4 months ago
- A SNOMED CT Concept Validation Library using Drools (Business Rules Engine)☆11May 28, 2026Updated last week
- A Vanilla CNN model on P300 Speller ERP☆10Aug 6, 2019Updated 6 years ago
- Voice activity detection (VAD) library and Go bindings based on WebRTC's VAD engine☆11Mar 1, 2018Updated 8 years ago
- [CVPR 2025] A Comprehensive Benchmark for Document Parsing and Evaluation☆1,784May 6, 2026Updated last month
- [IEEE TVCG 2025] Self-supervised Learning of Event-guided Video Frame Interpolation for Rolling Shutter Frames☆11Jun 1, 2025Updated last year
- (CVPR 2026) TRivia: Self-supervised Fine-tuning of Vision-Language Models for Table Recognition☆34Feb 5, 2026Updated 4 months ago