SciKnowEngine / lapdftext
LA-PDFText is a system for extracting accurate text from PDF-based research articles (and an interface to be able to improve performance where needed). The system is open-source and provides a simple baseline function for extracting text from primary research articles using rules that developers can customize. This means that the system works qu…
☆15 · Updated 6 years ago
Alternatives and similar repositories for lapdftext
Users who are interested in lapdftext are comparing it to the libraries listed below.
- LA-PDFText is a system for extracting accurate text from PDF-based research articles (and an interface to be able to improve performance … ☆81 · Updated 7 years ago
- Multi-Entity Extraction Framework for Academic Documents (with default extraction tools) ☆31 · Updated 2 years ago
- PDF to XML ALTO file converter ☆257 · Updated last month
- The sample web app for the yFiles use case about an Ontology Visualizer. ☆12 · Updated 8 months ago
- ☆32 · Updated 3 years ago
- DaSCH Service Platform API ☆76 · Updated this week
- RightField is an open-source tool for adding ontology term selection to Excel spreadsheets. RightField is used by a 'Template Creator' to… ☆31 · Updated 8 months ago
- A set of workflows for corpus building through OCR, post-correction and normalisation ☆49 · Updated 3 years ago
- Specification of NAF, the NLP annotation format ☆21 · Updated 4 years ago
- Conversion and validation for JATS XML ☆55 · Updated 5 months ago
- Basic and Advanced OBO Graphs: specification and reference implementation ☆68 · Updated 2 months ago
- Knowledge graph construction: Fast inserts into a Wikibase instance ☆46 · Updated 3 years ago
- 📦 The Knowledge Box - A data dependency management framework to help users to publish, find and install data models ☆47 · Updated 5 months ago
- Conversions between various OCR formats ☆82 · Updated 2 years ago
- Web based JavaScript GUI library for proofreading/editing hOCR ☆100 · Updated 7 years ago
- A Knowledge Graph-based Semantic Database for Biomedical Sciences ☆28 · Updated 7 years ago
- Advanced graph rewriting and LLOD publication for CoNLL and other TSV formats ☆25 · Updated 5 months ago
- Graph-based tool for disambiguation and linking of named entities to Linked Data sets for Digital Humanities and heritage texts ☆28 · Updated 4 years ago
- Text conversion tool (from e.g. Word, HTML, txt) to corpus formats (TEI or FoLiA) ☆23 · Updated 3 years ago
- A high performance bibliographic information service: https://biblio-glutton.readthedocs.io ☆146 · Updated 6 months ago
- A machine learning tool for fishing entities ☆266 · Updated 6 months ago
- Neuralized version of the Reference String Parser component of the ParsCit package. ☆81 · Updated 3 years ago
- Ergonomic line-by-line transcription of scanned text. ☆54 · Updated 5 years ago
- neonion is a user-centered collaborative semantic annotation webapp developed at the Human-Centered Computing group at Freie Universität … ☆68 · Updated 6 years ago
- Text annotation tool for team collaboration ☆43 · Updated last year
- Semantic Annotation Without the Pointy Brackets ☆158 · Updated last year
- OpenRefine Reconciliation Framework in Python and Flask ☆21 · Updated 2 years ago
- Tool that does layout analysis and/or text recognition using tesseract and outputs the result in Page XML format ☆46 · Updated 8 months ago
- The Pelagios Exploration Engine ☆21 · Updated 4 years ago
- Convert between Tesseract hOCR and ALTO XML using XSL stylesheets ☆58 · Updated 2 months ago