metachris / pdfx
Extract text, metadata and references (pdf, url, doi, arxiv) from PDF. Optionally download all referenced PDFs.
☆1,071 · Updated 2 years ago
Alternatives and similar repositories for pdfx
Users who are interested in pdfx are comparing it to the libraries listed below.
- MOVED TO https://gitlab.com/crossref/pdfextract ☆510 · Updated 8 years ago
- Content ExtRactor and MINEr ☆507 · Updated 3 years ago
- Query Google Scholar with Python ☆294 · Updated 2 months ago
- Documentation for Crossref's REST API. For questions or suggestions, see https://community.crossref.org/ ☆779 · Updated last year
- Academic writing with Markdown ☆354 · Updated 4 years ago
- Detect text blocks and OCR poorly scanned PDFs in bulk. Python module available via pip. ☆1,279 · Updated 4 years ago
- Pyzotero: a Python client for the Zotero API ☆1,124 · Updated 2 weeks ago
- Scripts for LaTeX to HTML5 conversion ☆717 · Updated 2 years ago
- A parser for Google Scholar, written in Python ☆2,162 · Updated 3 years ago
- 📚 Web of Science Python client ☆217 · Updated 2 years ago
- A set of tools to allow PDF to XML conversion, utilising Apache Beam and other tools. The aim of this project is to bring multiple tools… ☆294 · Updated 3 years ago
- Convert LaTeX documents into beautiful responsive web pages using LaTeXML. ☆1,097 · Updated last year
- A Python library for creating LaTeX files ☆2,346 · Updated last year
- ReproZip is a tool that simplifies the process of creating reproducible experiments from command-line executions, a frequently-used commo… ☆351 · Updated 11 months ago
- Extract data from websites using basic statistical magic ☆505 · Updated 5 years ago
- Create, edit and display a journal article, entirely in GitHub ☆619 · Updated 2 years ago
- BibTeX parser for Python 3 ☆553 · Updated 10 months ago
- PDF watermark removal library for academic papers ☆552 · Updated 5 years ago
- Bringing the Python data stack to the shell prompt ☆787 · Updated 4 years ago
- Easy color scales and color conversion for Python. ☆263 · Updated 9 months ago
- A toolkit for making domain-specific probabilistic parsers ☆806 · Updated last year
- Fork of Pandoc for the implementation of a ScholarlyMarkdown parser ☆334 · Updated 10 years ago
- This project contains the source code of a tool for generating regular expressions for text extraction: 1. automatically, 2. based only … ☆953 · Updated 5 years ago
- The ReScience journal. Reproducible Science is Good. Replicated Science is better. ☆703 · Updated 3 years ago
- Import tables from any Wikipedia article as a dataset in Python ☆293 · Updated 4 years ago
- An interactive data visualization tool which brings matplotlib graphics to the browser using D3. ☆2,394 · Updated last week
- High-level build project for all LAPDF-Text submodules ☆103 · Updated 10 years ago
- Interactive plotting for Python. ☆438 · Updated 5 months ago
- A utility to extract the title from a PDF file ☆143 · Updated 8 months ago
- Python command-line script for converting .csv data to LaTeX tables ☆222 · Updated 6 years ago