CrossRef/pdfextract

CrossRef / pdfextract

MOVED TO https://gitlab.com/crossref/pdfextract

☆510

Alternatives and similar repositories for pdfextract

Users who are interested in pdfextract are comparing it to the libraries listed below.

metachris / pdfx
Extract text, metadata and references (pdf, url, doi, arxiv) from PDF. Optionally download all referenced PDFs.
☆1,076 · Jun 15, 2023 · Updated 3 years ago
CeON / CERMINE
Content ExtRactor and MINEr
☆512 · Jun 30, 2022 · Updated 4 years ago
knmnyn / ParsCit
An open-source CRF Reference String Parsing Package
☆161 · May 6, 2020 · Updated 6 years ago
academia-edu / biblicit
Extract citations from PDFs.
☆28 · Feb 26, 2014 · Updated 12 years ago
grobidOrg / grobid
A machine learning software for extracting information from scholarly documents
☆5,005 · Updated this week
CrossRef / pdfmark
MOVED TO https://gitlab.com/crossref/pdfmark
☆34 · Nov 22, 2018 · Updated 7 years ago
ckorzen / icecite
The repository of Icecite, a research paper management system.
☆15 · Mar 29, 2018 · Updated 8 years ago
inspirehep / refextract
Extract bibliographic references from (High-Energy Physics) articles.
☆143 · Apr 16, 2026 · Updated 3 months ago
CrossRef / rest-api-doc
Documentation for Crossref's REST API. For questions or suggestions, see https://community.crossref.org/
☆798 · Sep 25, 2024 · Updated last year
ckreibich / scholar.py
A parser for Google Scholar, written in Python
☆2,176 · Jul 14, 2026 · Updated last week
agisga / mixed_models
Statistical mixed effects models in Ruby
☆21 · Jul 8, 2016 · Updated 10 years ago
SciRuby / statsample-glm
Generalized Linear Models extension for Statsample
☆24 · Jan 24, 2019 · Updated 7 years ago
WING-NUS / Neural-ParsCit
Neuralized version of the Reference String Parser component of the ParsCit package.
☆81 · May 27, 2022 · Updated 4 years ago
ropensci-archive / alm
ARCHIVED R Client for the Lagotto Altmetrics Platform
☆15 · May 10, 2022 · Updated 4 years ago
greenelab / crossref
Download metadata for all DOIs using the Crossref API
☆66 · Sep 25, 2018 · Updated 7 years ago
Phyks / libbmc
A Python library to deal with scientific papers.
☆17 · Apr 2, 2016 · Updated 10 years ago
oaworks / plugin
The One True Open Access Button - cross-compatible extension for research papers and data.
☆48 · Oct 8, 2024 · Updated last year
rufuspollock-okfn / bibserver
BibServer is open-source software that makes it easy to publish, manage and find bibliographies. BibServer is RESTful and web-friendly.
☆126 · Jan 31, 2019 · Updated 7 years ago
codeforscience / Dat-in-the-Lab
The Dat in the Lab project
☆33 · Jun 20, 2019 · Updated 7 years ago
SeerLabs / pdfmef
Multi-Entity Extraction Framework for Academic Documents (with default extraction tools)
☆31 · Oct 3, 2023 · Updated 2 years ago
dpriskorn / OpenAlexAPI
Python library for the OpenAlex HTTP API
☆23 · Feb 25, 2023 · Updated 3 years ago
18F / api-program
A complete agency API program.
☆12 · Apr 27, 2017 · Updated 9 years ago
proquest / PME
Publication Metadata Extraction
☆16 · Aug 31, 2025 · Updated 10 months ago
zotero / translation-server
A Node.js-based server to run Zotero translators
☆159 · Apr 28, 2026 · Updated 2 months ago
okfn / pdftables
A library for extracting tables from PDF files
☆89 · Sep 27, 2013 · Updated 12 years ago
kermitt2 / biblio-glutton-extension
A browser extension providing Open Access bibliographical services
☆18 · Dec 9, 2022 · Updated 3 years ago
CulturePlex / Sylva
A Relaxed Schema Graph Database Management System
☆53 · Feb 17, 2020 · Updated 6 years ago
dansheffler / zettelkasten-wiki
An Atom package for creating a zettelkasten-style wiki. Should be used with my Academic-Markdown syntax file.
☆12 · Jun 3, 2021 · Updated 5 years ago
minad / bibsync
BibSync is a tool to synchronize scientific papers and BibTeX bibliography files
☆59 · Apr 20, 2014 · Updated 12 years ago
tabulapdf / tabula-extractor
Extract tables from PDF files
☆358 · May 17, 2016 · Updated 10 years ago
ContentMine / getpapers
Get metadata, fulltexts or fulltext URLs of papers matching a search query
☆201 · Jul 15, 2020 · Updated 6 years ago
timtylin / scholdoc
Fork of Pandoc for the implementation of a ScholarlyMarkdown parser
☆337 · Jun 14, 2015 · Updated 11 years ago
venthur / gscholar
Query Google Scholar with Python
☆300 · Nov 24, 2025 · Updated 7 months ago
octopus-platform / octopus
Generic server for collaborative code analysis
☆13 · Dec 19, 2016 · Updated 9 years ago
grobidOrg / grobid-client-node
Simple Node.js client for GROBID REST services
☆23 · Jun 6, 2019 · Updated 7 years ago
benosteen / pairtree
Python Pairtree implementation
☆19 · Mar 26, 2018 · Updated 8 years ago
dpapathanasiou / pdfminer-layout-scanner
A more complete example of programming with PDFMiner, which continues where the default documentation stops
☆216 · Dec 3, 2019 · Updated 6 years ago
Atcold / torch-net-toolkit
A simple module for <Torch7> and the <nn> package
☆18 · Apr 30, 2015 · Updated 11 years ago
lazyprogrammer / matlab-probability-class
Resources and Materials for MATLAB Probability class
☆10 · Oct 23, 2015 · Updated 10 years ago