metachris / pdfx
Extract text, metadata and references (pdf, url, doi, arxiv) from PDF. Optionally download all referenced PDFs.
☆1,049 Updated last year
Alternatives and similar repositories for pdfx:
Users who are interested in pdfx are comparing it to the libraries listed below.
- MOVED TO https://gitlab.com/crossref/pdfextract ☆509 Updated 7 years ago
- Content ExtRactor and MINEr ☆489 Updated 2 years ago
- extract text from any document. no muss. no fuss. ☆3,959 Updated last month
- Pyzotero: a Python client for the Zotero API ☆943 Updated this week
- An awesome iTerm2 backend for Matplotlib, so you can plot directly in your terminal. ☆1,492 Updated last year
- Academic writing with Markdown ☆350 Updated 3 years ago
- Bringing the python data stack to the shell prompt ☆788 Updated 3 years ago
- Documentation for Crossref's REST API. For questions or suggestions, see https://community.crossref.org/ ☆749 Updated 4 months ago
- Python API and command-line tool for Sci-Hub ☆917 Updated 3 years ago
- The simplest way to extract text from PDFs in Python ☆427 Updated 2 years ago
- Powerful and highly extensible command-line based document and bibliography manager. ☆1,449 Updated 2 months ago
- Python script to do PDF OCR conversion using Tesseract ☆373 Updated last year
- A machine learning software for extracting information from scholarly documents ☆3,739 Updated this week
- Bibcure helps in boring tasks by keeping your bibfile up to date and normalized...also allows you to easily download all papers inside yo… ☆201 Updated 2 years ago
- A tool to create animated graph visualizations, based on graphviz. ☆491 Updated last year
- Downloads PDFs via a DOI number, article title or a bibtex file, using the databases of libgen (sci-hub) and arxiv ☆207 Updated 4 years ago
- A pure-python HTML screen-scraping library ☆1,869 Updated 2 years ago
- IPython magic command to profile and view your python code as a heat map. ☆1,032 Updated 6 months ago
- Interactive plotting for Python. ☆435 Updated 4 months ago
- Instant access to many datasets in Python. ☆938 Updated 2 years ago
- A Python data analysis library that is optimized for humans instead of machines. ☆1,173 Updated 6 months ago
- Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community. ☆1,535 Updated 9 months ago
- Computing with Python functions. ☆3,955 Updated this week
- Monitor the output of terminals and processes. ☆1,013 Updated 9 years ago
- A set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents. ☆2,232 Updated 2 years ago
- plotting in the terminal ☆1,880 Updated 7 months ago
- Convert LaTeX documents into beautiful responsive web pages using LaTeXML. ☆1,082 Updated last year
- Convert HTML to Markdown-formatted text. ☆2,669 Updated 11 months ago
- Extract data from websites using basic statistical magic ☆505 Updated 4 years ago
- Query Google Scholar with Python ☆290 Updated last year