LA-PDFText is a system for accurately extracting text from PDF-based research articles, with an interface for improving performance where needed. The system is open source and provides a simple baseline function for extracting text from primary research articles using rules that developers can customize. This means that the system works qu…
☆82 · Mar 2, 2018 · Updated 8 years ago
Alternatives and similar repositories for lapdftext
Users who are interested in lapdftext are comparing it to the libraries listed below.
- High-level build project for all LAPDF-Text submodules ☆103 · Jul 2, 2015 · Updated 10 years ago
- LA-PDFText is a system for extracting accurate text from PDF-based research articles (and an interface to be able to improve performance … ☆15 · Mar 21, 2019 · Updated 7 years ago
- Web-based page layout editor created for EMOP (Early Modern OCR Project). ☆11 · May 21, 2021 · Updated 5 years ago
- A basic tool that extracts the structure from the PDF files of scientific articles. ☆77 · Jan 4, 2022 · Updated 4 years ago
- Babel creates cliques of equivalent identifiers across many biomedical vocabularies. ☆14 · Updated this week
- ☆13 · Sep 2, 2024 · Updated last year
- An ordered Python dictionary with attribute-style access. ☆16 · Jun 23, 2020 · Updated 6 years ago
- Transforming SemRep Predications into an Open Biomedical Linked Data Resource ☆11 · Jan 26, 2018 · Updated 8 years ago
- REx: Relation Extraction. Modernized re-write of the code in the master's thesis: "Relation Extraction using Distant Supervision, SVMs, a… ☆22 · Mar 7, 2018 · Updated 8 years ago
- Post-processing OCR errors with seq2seq models ☆28 · Jul 30, 2020 · Updated 5 years ago
- Make quickfix entries editable ☆36 · Jul 6, 2014 · Updated 11 years ago
- ☆29 · Mar 13, 2018 · Updated 8 years ago
- An R data package for NIH EXPORTER data ☆15 · Mar 8, 2025 · Updated last year
- An R package for working with phenotypic screening data ☆10 · Nov 22, 2018 · Updated 7 years ago
- ☆19 · May 10, 2024 · Updated 2 years ago
- Hierarchical Embedding for Drugs ☆17 · Apr 18, 2024 · Updated 2 years ago
- Several algorithms for converting dependency structures into constituency structures. ☆10 · Feb 7, 2022 · Updated 4 years ago
- 🐸 Idiomatic conversion between URIs and compact URIs (CURIEs) in Python ☆27 · Jun 16, 2026 · Updated 2 weeks ago
- ☆13 · Jul 11, 2017 · Updated 8 years ago
- ☆31 · Mar 7, 2017 · Updated 9 years ago
- Ginkgo SARS-CoV-2 synthesis efforts: overviews and data ☆13 · Jun 7, 2020 · Updated 6 years ago
- CQRS & EventSourcing library for php >= 5.5 ☆14 · Oct 20, 2016 · Updated 9 years ago
- Content ExtRactor and MINEr ☆512 · Jun 30, 2022 · Updated 4 years ago
- A helper that integrates Pydantic with requests library for seamless access to defined Models ☆11 · Mar 9, 2022 · Updated 4 years ago
- Keepass HTTP implementation in Python to use with ChromeIPass ☆25 · Jan 21, 2022 · Updated 4 years ago
- Quantify extrapolation of new samples given a training set ☆48 · Feb 28, 2026 · Updated 4 months ago
- Zipkin tracing for Scala Futures and non-Futures (synchronous operations) ☆21 · May 11, 2017 · Updated 9 years ago
- Hetnets in Python (relocated from dhimmel/hetio) ☆103 · Sep 4, 2025 · Updated 9 months ago
- Code to remove "noise" from hOCR output of Tesseract OCR. ☆14 · Oct 24, 2016 · Updated 9 years ago
- Sequence Labeling Parsing by Learning Across Representations ☆13 · Oct 3, 2019 · Updated 6 years ago
- Distillation of Ensemble Dependency Parsers into a Single Graph-Based Parser ☆11 · Oct 14, 2016 · Updated 9 years ago
- Computer Vision tutorial for DH Summer School Antwerp ☆11 · Jun 16, 2026 · Updated 2 weeks ago
- ☆24 · Jun 16, 2026 · Updated 2 weeks ago
- Core libraries by the PRImA Research Lab ☆16 · Jul 30, 2024 · Updated last year
- Financial Market Building Blocks ☆12 · Feb 1, 2022 · Updated 4 years ago
- A knowledge resource on cell lines - From SIB CALIPHO group ☆14 · Jan 31, 2024 · Updated 2 years ago
- Compound figure separation tool ☆22 · Jun 13, 2024 · Updated 2 years ago
- Tool that does layout analysis and/or text recognition using tesseract and outputs the result in Page XML format ☆47 · Mar 31, 2025 · Updated last year
- Extract and find/replace text based on arbitrary correspondences while preserving original file formatting. This library is a fork from t… ☆11 · Sep 8, 2023 · Updated 2 years ago