data-liberation / data-liberation-resources
Liberate all kinds of data from PDF and other unstructured formats, and make the information machine-readable and visualizable with popular tools.
☆31Updated 7 years ago
Alternatives and similar repositories for data-liberation-resources
Users that are interested in data-liberation-resources are comparing it to the libraries listed below
- Schema crawler for Baidu Baike and Hudong Baike: a script that crawls the concept classification systems of both encyclopedias☆38Updated 7 years ago
- table understanding dataset for comparative evaluation of different table understanding algorithms☆14Updated 7 years ago
- Framework for information extraction from tables☆41Updated 6 years ago
- Extract templated Open Information Extraction☆17Updated 8 years ago
- Chinese Environment Emergency Corpus, Shanghai University, Semantic Intelligence Laboratory☆46Updated 10 years ago
- ☆70Updated 7 years ago
- ☆23Updated 6 years ago
- ICDAR 2021 Competition on Scientific Literature Parsing☆35Updated 5 years ago
- SegPhrase working on Chinese and Arabic☆36Updated 9 years ago
- ☆95Updated 5 years ago
- Optical table recognition - recognize tables in scan images using OpenCV☆112Updated 6 years ago
- Data collection, alignment and TAUS repository☆23Updated 8 years ago
- Detect table images in PDFs or other image formats using OpenCV and Python.☆54Updated 6 years ago
- Mines element-recognition rules from the CEC corpus and uses them to automatically annotate raw news-report text☆20Updated 10 years ago
- Table Extraction Tool☆90Updated 7 years ago
- This repository contains a dataset of 403 images for table detection in documents.☆83Updated 7 years ago
- BlackLab Frontend, a feature-rich corpus search interface for BlackLab.☆22Updated this week
- ☆87Updated 5 years ago
- ☆12Updated 5 years ago
- Summary of Responses to Questionnaire on Annotation Platform https://forms.gle/iZk8kehkjAWmB8xe9☆60Updated 5 years ago
- ☆40Updated 5 years ago
- Code for ICPR2022 paper: "Graph Neural Networks and Representation Embedding for table extraction in PDF Documents"☆37Updated 2 years ago
- PDF table extraction☆10Updated 4 years ago
- Post-processing OCR errors with seq2seq models☆28Updated 5 years ago
- An open-source classical Chinese information processing toolkit developed by Tsinghua Natural Language Processing Group☆51Updated 7 years ago
- ☆82Updated 3 years ago
- Tools for extracting figures, tables, text, and more from a PDF document.☆34Updated 5 years ago
- AI_DocumentLayoutAnalysis☆39Updated 5 years ago
- A relatively complete document analysis and recognition project☆144Updated 6 years ago
- A benchmark corpus of 100 English novels, covering the 19th and the beginning of the 20th century☆24Updated 3 years ago