NLPatVCU / PaperScraper
A web scraping tool to systematically extract the text of scientific papers and corresponding metadata from university accessible journals.
☆188 Updated 2 years ago
Related projects
Alternatives and complementary repositories for PaperScraper
- Uses publisher APIs to programmatically retrieve scientific journal articles for text mining. ☆119 Updated 10 months ago
- PyPaperBot is a Python tool for downloading scientific papers using Google Scholar, Crossref, SciHub, and SciDB. ☆389 Updated last week
- Tools to scrape publication metadata from PubMed, arXiv, medRxiv and ChemRxiv. ☆238 Updated this week
- A toolkit for automatically extracting semantic information from PDF files of scientific articles ☆65 Updated 10 months ago
- A Python library that implements the Crossref API. ☆280 Updated last month
- A Python module for use with Elsevier's APIs: Scopus, ScienceDirect, others. ☆367 Updated 2 years ago
- A curated collection of resources on scholarly data analysis ranging from datasets, papers, and code about bibliometrics, citation analys… ☆178 Updated last year
- Python PDF parser for scientific publications: content and figures ☆353 Updated 7 months ago
- litreviewer is a Python package (collection of few Python modules) that helps researchers perform crawling, scraping, collecting (corpus)… ☆38 Updated 3 months ago
- A proof of concept to scrape papers from journals ☆246 Updated 5 months ago
- Client for the Crossref search API ☆203 Updated this week
- Python-based API wrapper to access Scopus ☆423 Updated this week
- Parsers for scientific papers (PDF2JSON, TEX2JSON, JATS2JSON) ☆347 Updated 6 months ago
- Python client for GROBID Web services ☆284 Updated 2 weeks ago
- A Python library for OpenAlex (openalex.org) ☆159 Updated this week
- An unofficial API for downloading papers from SciHub via DOI, PMID, title ☆208 Updated 8 months ago
- Open Access PDF harvester, metadata aggregator and full-text ingester ☆54 Updated 6 months ago
- Science-parse version 2 ☆231 Updated 4 years ago
- Simple Python parser for MEDLINE, PubMed OA affiliation string ☆37 Updated 3 years ago
- Python toolkit for NCBI metadata (via eutils) and PubMed article text mining -- official primary repo. ☆95 Updated 2 months ago
- Public release of data and code for materials synthesis generation ☆71 Updated 2 years ago
- S2ORC: The Semantic Scholar Open Research Corpus: https://www.aclweb.org/anthology/2020.acl-main.447/ ☆828 Updated 6 months ago
- Search for and retrieve US Patent and Trademark Office Patent Data ☆76 Updated 4 years ago
- Automatic synthesis of RCTs ☆140 Updated 2 years ago
- A Python package to download full article PDFs from OA publications ☆38 Updated 3 months ago
- Python script to download papers from Sci-Hub ☆72 Updated 5 years ago
- PyMed is a Python library that provides access to PubMed. ☆194 Updated 2 years ago
- Science Parse parses scientific papers (in PDF form) and returns them in structured form. ☆626 Updated 5 months ago
- Service for converting and enhancing heterogeneous publisher XML formats into TEI ☆44 Updated last month
- Unofficial Python client library for Semantic Scholar APIs. ☆310 Updated 2 weeks ago