dell-research-harvard/HJDataset

A Large Dataset of Historical Japanese Documents with Complex Layouts

☆37

Alternatives and similar repositories for HJDataset

Users who are interested in HJDataset are comparing it to the repositories listed below.

OCR-D / ocrd_pagetopdf
OCR-D wrapper for prima-pagetopdf
☆10 · Oct 30, 2025 · Updated 8 months ago
kohei-kawaguchi / TestAI
☆19 · Mar 17, 2026 · Updated 4 months ago
HCIILAB / M5HisDoc
☆34 · Dec 18, 2025 · Updated 7 months ago
HCIILAB / SCUT-CAB_Dataset_Release
☆31 · May 8, 2025 · Updated last year
kba / transkribus-to-prima
Convert Transkribus PAGE-XML to standard PAGE-XML
☆12 · Dec 10, 2025 · Updated 7 months ago
uniwue-zpd / PAGETools
Small collection of PAGE XML related scripts used at the ZPD Würzburg
☆12 · Aug 2, 2024 · Updated last year
lancercat / OSOCR
☆10 · Nov 21, 2023 · Updated 2 years ago
Levi-ZJY / SAN
SAN: Structure-Aware Network for Complex and Long-tailed Chinese Text Recognition
☆10 · Apr 8, 2024 · Updated 2 years ago
ssocean / AlphX-Code-For-DAR
Source code from team Alphx for the ancient-book document image recognition and analysis contest of the Guangdong-Hong Kong-Macao Greater Bay Area (Huangpu) International Algorithm Case Competition
☆46 · Mar 16, 2023 · Updated 3 years ago
ihdia / seamformer
Official repository accompanying the ICDAR 2023 paper
☆14 · Oct 3, 2023 · Updated 2 years ago
FactoDeepLearning / LinePytorchOCR
☆17 · Feb 16, 2023 · Updated 3 years ago
cerp-analytics / pbs2017
This repository contains digitized data from Pakistan Bureau of Statistics's 2017 Census results. We converted them to csv format to help…
☆10 · Nov 11, 2021 · Updated 4 years ago
FIWARE / tutorials.Identity-Management
FIWARE 401: IDM - Managing Users and Organizations
☆10 · May 15, 2026 · Updated 2 months ago
ihdia / instance-segmentation-v1
☆10 · Jan 22, 2023 · Updated 3 years ago
KaixuanZ / PR1956
☆13 · Nov 8, 2020 · Updated 5 years ago
algorithmica-repository / deep-learning
It consists of all code examples discussed as part of deep learning course taken at algorithmica
☆11 · Oct 1, 2020 · Updated 5 years ago
performant-software / neatline-omeka-s
A module for Omeka S that provides an API for the Neatline 3 single page application
☆18 · Mar 26, 2023 · Updated 3 years ago
HCIILAB / TKH_MTH_Datasets_Release
The Tripitaka Koreana in Han (TKH) Dataset and the Multiple Tripitaka in Han (MTH) Dataset for the research of Chinese character detectio…
☆73 · Sep 23, 2020 · Updated 5 years ago
oriflamms / HORAE
☆14 · Jun 18, 2026 · Updated last month
a8dx / Stata-Tools
Template code for exporting Stata regression output to beautiful LaTeX tables
☆18 · May 27, 2016 · Updated 10 years ago
kuieless / Aids-for-X-anylabeling
☆17 · Nov 6, 2025 · Updated 8 months ago
duanjiaqi / PMTD
Pyramid Mask Text Detector designed by SenseTime Video Intelligence Research team.
☆14 · Aug 1, 2019 · Updated 6 years ago
dh-linghaibin / android-IAP-stm32
Android app that reads a hex file and downloads it to an STM32 microcontroller over Bluetooth
☆11 · Nov 6, 2017 · Updated 8 years ago
zaratsian / Datasets
Interesting Public Datasets
☆12 · Apr 28, 2023 · Updated 3 years ago
sergiocorreia / stata-require
Enforce exact/minimum versions of community-contributed packages.
☆19 · May 10, 2024 · Updated 2 years ago
Form2Seq-Data / Dataset
Dataset corresponding to the paper: "Form2Seq : A Framework for Higher-Order Form Structure Extraction"
☆10 · Feb 17, 2021 · Updated 5 years ago
brentzucker / brownlee
Code snippets from Jason Brownlee's ML and Deep Learning books.
☆12 · Mar 22, 2017 · Updated 9 years ago
MAEHCM / AET
Code for AAAI 2023 Paper : “Alignment-Enriched Tuning for Patch-Level Pre-trained Document Image Models”
☆18 · Dec 6, 2022 · Updated 3 years ago
vanderLaan-Group / vanderLaan-lab.org
Website and blog for the research group of Mark J. van der Laan
☆11 · Jul 1, 2021 · Updated 5 years ago
hchulkim / econ-paper-template
This is a Repo for the econ working paper template.
☆21 · Apr 11, 2026 · Updated 3 months ago
luisguiserrano / machine-learning
Content for Udacity's Machine Learning curriculum
☆15 · Jul 29, 2016 · Updated 9 years ago
Leoperon / Codes
100DaysOfCode
☆11 · Aug 18, 2020 · Updated 5 years ago
algorithmica-repository / big-datascience
It consists of all the code examples of big datascience course taken at algorithmica
☆13 · Oct 7, 2018 · Updated 7 years ago
My-Azure-Projects / AzureDevops
This project was created as a personal learning process. It is a simple example implementation of Azure Devops & Nodejs Application (Angu…
☆11 · Oct 10, 2020 · Updated 5 years ago
itversity / spark-sql
Apache Spark using SQL
☆14 · Aug 18, 2021 · Updated 4 years ago
iKrishneel / detectron2_timm
A simple wrapper library for binding timm models as detectron2 backbones
☆45 · May 31, 2023 · Updated 3 years ago
jaz-alli / k-NN-Tutorial-Scikit-learn
This is the notebook that goes along with the 'Building a k-NN model with Scikit-learn' tutorial on Medium.
☆10 · Sep 26, 2018 · Updated 7 years ago
Joshua-omolewa / Retailstore_ETL_pipeline_project
Built a Data Pipeline for a Retail store using AWS services that collects data from its transactional database (OLTP) in Snowflake and tr…
☆13 · May 25, 2023 · Updated 3 years ago
gioannides / Gaussian-Adaptive-Attention
☆27 · May 22, 2024 · Updated 2 years ago