allenai / s2-folks
Public space for the user community of Semantic Scholar APIs to share scripts, report issues, and make suggestions.
☆208 · Updated this week
Alternatives and similar repositories for s2-folks:
Users who are interested in s2-folks are comparing it to the repositories listed below.
- Get answers to research questions from 200M+ papers. ☆203 · Updated last year
- Parsers for scientific papers (PDF2JSON, TEX2JSON, JATS2JSON) ☆351 · Updated 9 months ago
- ☆84 · Updated 8 months ago
- Python PDF parser for scientific publications: content and figures ☆382 · Updated 9 months ago
- A data set based on all arXiv publications, pre-processed for NLP, including structured full-text and citation network ☆282 · Updated 3 months ago
- LitLLM: A Toolkit for Scientific Literature Review ☆48 · Updated 9 months ago
- A public repository to help researchers begin self-hosting data from Semantic Scholar. ☆37 · Updated 2 months ago
- SciRepEval benchmark training and evaluation scripts ☆71 · Updated 8 months ago
- Unofficial Python client library for Semantic Scholar APIs. ☆343 · Updated 2 weeks ago
- Python client for GROBID Web services ☆301 · Updated 2 weeks ago
- Open Access PDF harvester, metadata aggregator and full-text ingester ☆57 · Updated 8 months ago
- Official implementation of the ACL 2024 paper "Scientific Inspiration Machines Optimized for Novelty" ☆70 · Updated 9 months ago
- Pretraining Efficiently on S2ORC! ☆147 · Updated 2 months ago
- This repository contains ScholarQABench data and evaluation pipeline. ☆51 · Updated last month
- A collection of Jupyter notebooks, each walking you through a common example of bibliometric analysis using scholarly data from the OpenA… ☆100 · Updated 8 months ago
- Code for MedCPT, a model for zero-shot biomedical information retrieval. ☆150 · Updated 9 months ago
- Incorporating distribution of experts in order to better predict the future discovery of novel scientific connections ☆27 · Updated last year
- The Harvard USPTO Patent Dataset ☆61 · Updated last year
- Dense X Retrieval: What Retrieval Granularity Should We Use? ☆141 · Updated last year
- CiteME is a benchmark designed to test the abilities of language models in finding papers that are cited in scientific texts. ☆39 · Updated 2 months ago
- Code/data for MARG (multi-agent review generation) ☆36 · Updated 2 months ago
- Semantic Scholar's Author Disambiguation Algorithm & Evaluation Suite ☆91 · Updated 11 months ago
- A proof of concept to scrape papers from journals ☆265 · Updated 7 months ago
- ☆255 · Updated last month
- [EMNLP 2024] A Retrieval Benchmark for Scientific Literature Search ☆68 · Updated last month
- A Python library for OpenAlex (openalex.org) ☆186 · Updated this week
- 🗺️ Data Cleaning and Textual Data Visualization 🗺️ ☆159 · Updated 6 months ago
- Edu-ConvoKit: An Open-Source Framework for Education Conversation Data ☆82 · Updated 5 months ago
- A Benchmark of PDF Information Extraction Tools using a Multi-Task and Multi-Domain Evaluation Framework for Academic Documents ☆22 · Updated 2 years ago
- Multimodal document analysis ☆161 · Updated 7 months ago