datasets resource
☆132Jul 1, 2025Updated 8 months ago
Alternatives and similar repositories for opendatalab-datasets
Users that are interested in opendatalab-datasets are comparing it to the libraries listed below.
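- Data Set Description Language Specification (DSDL, a next-generation AI dataset description language)☆46May 29, 2024Updated last year
- WanJuan-CC is a high-quality dataset built from CommonCrawl through data extraction, rule-based cleaning, deduplication, safety filtering, and quality cleaning.☆13Apr 18, 2024Updated last year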
- Data annotation toolbox supports image, audio and video data.☆1,524Mar 20, 2026Updated last week
- MLLM-DataEngine: An Iterative Refinement Approach for MLLM☆48May 24, 2024Updated last year
- ECCV2024_Parrot Captions Teach CLIP to Spot Text☆66Sep 6, 2024Updated last year
- The Open-Source Data Annotation Platform☆1,197Feb 19, 2025Updated last year
- Out-of-the-box Annotation Toolbox☆395Apr 19, 2024Updated last year
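- WanJuan3.0 ("WanJuan Silk Road") is a comprehensive plain-text corpus that collects publicly available web information, literature, patents, and other materials from multiple countries and regions. Its total data size exceeds 1.2TB and its total token count exceeds 300B, placing it at an internationally leading level. The first open-source release consists mainly of five subsets: Thai, Russian, Arabic, Korean, and Vietnamese. The data in each subset…☆43Feb 13, 2025Updated last year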
- UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition☆460Sep 28, 2025Updated 6 months ago
- (ICCV 2025) OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation☆94Dec 3, 2025Updated 3 months ago
- WanJuan 1.0 multimodal corpus☆571Oct 20, 2023Updated 2 years ago
- A Python package for interacting with the MinerU Vision-Language Model.☆109Updated this week
- MPB (Miner-PDF-Benchmark) is an end-to-end PDF document comprehension evaluation suite designed for large-scale model data scenarios.☆24Dec 11, 2024Updated last year
- A Comprehensive Toolkit for High-Quality PDF Content Extraction☆9,503Jan 3, 2025Updated last year
- The source code for “Homophily-Related: Adaptive Hybrid Graph Filter for Multi-View Graph Clustering”☆10Apr 10, 2024Updated last year
- Dense Article Dataset (DAD): A Benchmark Dataset for Document Layout Analysis☆16Jan 13, 2022Updated 4 years ago
- PaperPub is an academic arena where diverse AI Agents read papers daily, pick apart each other's arguments, and fiercely debate.☆43Updated this week
- DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception☆2,072Apr 14, 2025Updated 11 months ago
- Normal Learning in Videos with Attention Prototype Network☆18Jan 19, 2023Updated 3 years ago
- The official implementation of the paper "CrossViewDiff: A Cross-View Diffusion Model for Satellite-to-Street View Synthesis"☆16Sep 2, 2024Updated last year
- Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.☆57,475Updated this week
- This is a repository for ACMMM22 paper "Exploring Effective Knowledge Transfer for Few-shot Object Detection"☆17Jun 21, 2023Updated 2 years ago
- ☆14Apr 19, 2024Updated last year
- [ICCV25 Highlight] The official implementation of the paper "LEGION: Learning to Ground and Explain for Synthetic Image Detection"☆75Oct 22, 2025Updated 5 months ago
- A PyTorch implementation of Cyclical Learning Rates☆25Jan 30, 2018Updated 8 years ago
- Deep learning models and datasets for the medical industry, open-sourced progressively☆13Dec 30, 2021Updated 4 years ago
- The complete NUMA-optimized branch of the ktransformers project☆25Nov 3, 2025Updated 4 months ago
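- A hybrid-inference extension plugin for vLLM that supports multi-NUMA hybrid inference; single-GPU inference of the Qwen3-Next model can reach 1000+ prefill☆31Nov 7, 2025Updated 4 months ago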
- GRPO Algorithm for Llava Architecture (Based on Verl)☆49May 9, 2025Updated 10 months ago
- Official repository for ODQA experiments from Decomposed Prompting: A Modular Approach for Solving Complex Tasks, ICLR23☆12Jul 28, 2023Updated 2 years ago
- [ICLR 2025 Spotlight] The official implementation of the paper “LOKI:A Comprehensive Synthetic Data Detection Benchmark using Large Multi…☆176Feb 7, 2026Updated last month
- [CVPR 2025] A Comprehensive Benchmark for Document Parsing and Evaluation☆1,612Feb 27, 2026Updated last month
- This repository calibrates multi-lidar, lidar-IMU, lidar-camera, and single-lidar setups☆13Oct 7, 2020Updated 5 years ago
- Official release of InternLM series (InternLM, InternLM2, InternLM2.5, InternLM3).☆7,174Oct 30, 2025Updated 5 months ago
- Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization☆100Jan 30, 2024Updated 2 years ago
- InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions☆2,923May 26, 2025Updated 10 months ago
- A simple dictionary in Manchu, Chinese and English.☆13Feb 27, 2015Updated 11 years ago
- Dingo: A Comprehensive AI Data, Model and Application Quality Evaluation Tool☆666Mar 23, 2026Updated last week
- Introduction to Praat scripting☆15Apr 8, 2025Updated 11 months ago