allenai / s2-folks
Public space for the user community of Semantic Scholar APIs to share scripts, report issues, and make suggestions.
☆166 Updated last week
Related projects:
- Get answers to research questions from 200M+ papers. ☆203 Updated 8 months ago
- ☆78 Updated 4 months ago
- Parsers for scientific papers (PDF2JSON, TEX2JSON, JATS2JSON) ☆332 Updated 5 months ago
- Python PDF parser for scientific publications: content and figures ☆328 Updated 5 months ago
- A data set based on all arXiv publications, pre-processed for NLP, including structured full-text and citation network ☆257 Updated 11 months ago
- SciRepEval benchmark training and evaluation scripts ☆67 Updated 4 months ago
- Incorporating distribution of experts in order to better predict the future discovery of novel scientific connections ☆22 Updated 10 months ago
- Unofficial Python client library for Semantic Scholar APIs. ☆287 Updated 2 months ago
- S2ORC: The Semantic Scholar Open Research Corpus: https://www.aclweb.org/anthology/2020.acl-main.447/ ☆802 Updated 4 months ago
- Tools to scrape publication metadata from pubmed, arxiv, medrxiv and chemrxiv. ☆211 Updated 2 months ago
- Open Access PDF harvester, metadata aggregator and full-text ingester ☆54 Updated 4 months ago
- Official implementation of the ACL 2024: Scientific Inspiration Machines Optimized for Novelty ☆63 Updated 5 months ago
- Python client for GROBID Web services ☆279 Updated 3 weeks ago
- Semantic search engine indexing 95 million academic publications ☆76 Updated last year
- ☆81 Updated 3 months ago
- Dense X Retrieval: What Retrieval Granularity Should We Use? ☆120 Updated 8 months ago
- Semantic Scholar's Author Disambiguation Algorithm & Evaluation Suite ☆87 Updated 7 months ago
- A Python library for OpenAlex (openalex.org) ☆142 Updated last week
- LitLLM: A Toolkit for Scientific Literature Review ☆39 Updated 5 months ago
- Code for MedCPT, a model for zero-shot biomedical information retrieval. ☆124 Updated 5 months ago
- A proof of concept to scrape papers from journals ☆227 Updated 3 months ago
- Edu-ConvoKit: An Open-Source Framework for Education Conversation Data ☆70 Updated last month
- Code & Prompts for TopicGPT: A Prompt-Based Framework for Topic Modeling ☆202 Updated 5 months ago
- Neighborhood Contrastive Learning for Scientific Document Representations with Citation Embeddings (EMNLP 2022 paper) ☆63 Updated last year
- The landscape of biomedical research ☆113 Updated 5 months ago
- ☆31 Updated 8 months ago
- The Harvard USPTO Patent Dataset ☆54 Updated 9 months ago
- Biomedical Question Answering Datasets. ☆71 Updated last year
- potato: portable text annotation tool ☆285 Updated 3 weeks ago
- ☆35 Updated last month