Given a scholarly PDF, extract figures, tables, captions, and section titles.
★729 · Mar 10, 2024 · Updated last year
Alternatives and similar repositories for pdffigures2
Users interested in pdffigures2 are comparing it to the libraries listed below.
- Command line tool to extract figures, tables, and captions from scholarly documents in PDF form. ★129 · Apr 9, 2018 · Updated 7 years ago
- Companion code to the paper "Extracting Scientific Figures with Distantly Supervised Neural Networks" ★146 · Jun 14, 2022 · Updated 3 years ago
- Science Parse parses scientific papers (in PDF form) and returns them in structured form. ★696 · May 26, 2024 · Updated last year
- Science-parse version 2 ★254 · Nov 20, 2019 · Updated 6 years ago
- A machine learning software for extracting information from scholarly documents ★4,670 · Updated this week
- Python PDF parser for scientific publications: content and figures ★452 · Mar 21, 2024 · Updated last year
- Processing OpenCitations Data ★20 · Aug 17, 2017 · Updated 8 years ago
- https://doi.org/10.1093/bioinformatics/btz228 ★44 · Nov 19, 2024 · Updated last year
- Python client for GROBID Web services ★391 · Jan 9, 2026 · Updated last month
- ★41 · May 15, 2020 · Updated 5 years ago
- A high performance bibliographic information service: https://biblio-glutton.readthedocs.io ★148 · Jun 19, 2025 · Updated 8 months ago
- Multi-Entity Extraction Framework for Academic Documents (with default extraction tools) ★31 · Oct 3, 2023 · Updated 2 years ago
- A PDF figure and table parsing plugin based on PDFFigure2 ★518 · Sep 18, 2025 · Updated 5 months ago
- S2ORC: The Semantic Scholar Open Research Corpus: https://www.aclweb.org/anthology/2020.acl-main.447/ ★1,016 · Apr 26, 2024 · Updated last year
- The OpenCitations metadata model: documents and other material. ★19 · Nov 20, 2025 · Updated 3 months ago
- SPECTER: Document-level Representation Learning using Citation-informed Transformers ★572 · Jun 12, 2023 · Updated 2 years ago
- ★119 · Mar 11, 2025 · Updated 11 months ago
- A BERT model for scientific text. ★1,670 · Feb 22, 2022 · Updated 4 years ago
- ARCHIVED R Client for the Lagotto Altmetrics Platform ★15 · May 10, 2022 · Updated 3 years ago
- ★1,040 · Jul 9, 2025 · Updated 7 months ago
- Parsers for scientific papers (PDF2JSON, TEX2JSON, JATS2JSON) ★457 · Apr 11, 2024 · Updated last year
- wrapper for the crossref events api ★23 · May 23, 2023 · Updated 2 years ago
- Open Access PDF harvester ★42 · May 3, 2024 · Updated last year
- Softcite software mention recognizer, finding mentions and citations to software from within the academic literature ★82 · Sep 30, 2025 · Updated 5 months ago
- Content ExtRactor and MINEr ★513 · Jun 30, 2022 · Updated 3 years ago
- Software for creating all the OpenCitations Indexes (e.g. COCI) ★15 · Feb 19, 2026 · Updated 2 weeks ago
- library supporting NLP and CV research on scientific papers ★789 · Nov 8, 2024 · Updated last year
- PDF to XML ALTO file converter ★264 · Feb 11, 2026 · Updated 3 weeks ago
- An experimental desktop client for using Claude Desktop's MCP with Novelcrafter codices. ★10 · Dec 3, 2024 · Updated last year
- Consider is a parser for the ThinkGear protocol used by NeuroSky devices (MindSet, BrainBand and others). ★16 · Apr 3, 2012 · Updated 13 years ago
- Dataset accompanying the SPECTER model ★143 · Dec 19, 2022 · Updated 3 years ago
- Download metadata for all DOIs using the Crossref API ★66 · Sep 25, 2018 · Updated 7 years ago
- Collection of public APIs for embedding scientific papers ★59 · Feb 19, 2021 · Updated 5 years ago
- Toolkit for Zotero Plugin Developers. ★185 · Feb 17, 2026 · Updated 2 weeks ago
- A plugin template for Zotero. ★785 · Feb 25, 2026 · Updated last week
- Hello world demonstration for Weblate ★14 · Jan 20, 2026 · Updated last month
- Citation Manager for OJS ★13 · Jun 4, 2024 · Updated last year
- ★11 · Dec 10, 2022 · Updated 3 years ago
- Go bindings for emokit ★11 · Mar 12, 2014 · Updated 11 years ago