ad-freiburg / pdfact
A basic tool that extracts the structure from the PDF files of scientific articles.
☆76 Jan 4, 2022 Updated 4 years ago
Alternatives and similar repositories for pdfact
Users who are interested in pdfact are comparing it to the libraries listed below.
- A project about benchmarking and evaluating existing PDF extraction tools on their semantic abilities to extract the body texts from PDF … ☆69 Nov 7, 2020 Updated 5 years ago
- Named Entity Disambiguation and Linking ☆16 May 24, 2024 Updated last year
- my take at a PDF text extraction utility ☆25 Jun 15, 2015 Updated 10 years ago
- PDF Extraction Toolkit ☆42 Nov 23, 2020 Updated 5 years ago
- Batch scripts curating BioRxiv and PubMed articles by using Altmetric score. ☆11 May 9, 2020 Updated 5 years ago
- Natural Language to SQL Queries in the OMOP CDM Datasets ☆11 Jun 12, 2023 Updated 2 years ago
- Convert ALTO XML to plain text + minimal metadata ☆17 Oct 17, 2024 Updated last year
- table understanding dataset for comparative evaluation of different table understanding algorithms ☆14 Jun 15, 2018 Updated 7 years ago
- Tokenize and clean strings in Python ☆11 Jan 11, 2018 Updated 8 years ago
- Computer Vision Segmentation for Document Layout Analysis ☆10 Sep 26, 2022 Updated 3 years ago
- Spell checker using Brill and Moore's noisy channel error model ☆12 Jan 9, 2019 Updated 7 years ago
- Recommendation engine for scholarly articles ☆12 Oct 22, 2019 Updated 6 years ago
- Access different AI models in one place ☆22 Jul 31, 2023 Updated 2 years ago
- SIGIR'20: An Analysis of BERT in Document Ranking ☆21 Jul 27, 2020 Updated 5 years ago
- Systematic Review Query Visualisation and Understanding Interface ☆17 Dec 5, 2025 Updated 2 months ago
- ☆20 Jul 22, 2021 Updated 4 years ago
- Simple face alignment library by using face_recognition and opencv ☆16 Mar 13, 2019 Updated 6 years ago
- TeXoo – A Zoo of Text Extractors ☆18 Jun 2, 2020 Updated 5 years ago
- Flint SPARQL editor ☆51 Oct 16, 2012 Updated 13 years ago
- ☆15 Feb 18, 2019 Updated 6 years ago
- A toolkit for asynchronously validating dense retriever checkpoints during training. ☆27 Aug 10, 2023 Updated 2 years ago
- Raven is a Web application penetration testing tool. ☆17 Jun 16, 2021 Updated 4 years ago
- Extract text from your DOCX documents. ☆11 Feb 10, 2024 Updated 2 years ago
- Document Layout Analysis Projects ☆23 Sep 4, 2019 Updated 6 years ago
- Analyze XML extracted from PDFs (e.g. from TET or PDFMiner) ☆20 Jan 11, 2018 Updated 8 years ago
- ☆25 Oct 27, 2020 Updated 5 years ago
- ☆26 Nov 22, 2022 Updated 3 years ago
- 📑 Python Package to reconstruct the original continuous text from PDFs with language models ☆32 Sep 8, 2023 Updated 2 years ago
- Jurisdiction ID and abbreviation data files for using with Jurism and other projects. ☆41 Nov 8, 2023 Updated 2 years ago
- A Python interface to PISA ☆37 Sep 23, 2025 Updated 4 months ago
- ProgramCMS is a complete, trustworthy CMS and easy-to-use PHP framework to build and deploy all kinds of websites. Please note that Program… ☆18 Apr 30, 2025 Updated 9 months ago
- Graph-based tool for disambiguation and linking of named entities to Linked Data sets for Digital Humanities and heritage texts ☆28 Sep 20, 2021 Updated 4 years ago
- A library for minimizing the effects of confounding covariates ☆15 May 28, 2025 Updated 8 months ago
- Web platform for collaborative text analytics ☆34 Jul 6, 2022 Updated 3 years ago
- ☆70 Apr 3, 2018 Updated 7 years ago
- The HfG Documentation Generator ☆15 Apr 15, 2025 Updated 9 months ago
- ☆33 Nov 16, 2022 Updated 3 years ago
- CityDPC: A Python Library for 3D City Model Processing ☆12 Nov 18, 2025 Updated 2 months ago
- ☆17 Dec 30, 2025 Updated last month