pengyuanli / PDFigCapX
https://doi.org/10.1093/bioinformatics/btz228
☆37 · Updated last year
Related projects
Alternatives and complementary repositories for PDFigCapX
- Code for MedCPT, a model for zero-shot biomedical information retrieval. ☆138 · Updated 7 months ago
- Python PDF parser for scientific publications: content and figures ☆353 · Updated 7 months ago
- LitQA Eval: A difficult set of scientific questions that require context of full-text research papers to answer ☆35 · Updated 3 months ago
- This is Clinfo.AI Demo Instruction ☆28 · Updated 2 months ago
- Tools to scrape publication metadata from pubmed, arxiv, medrxiv and chemrxiv. ☆238 · Updated this week
- PDF parsing toolkit for preparing academic text corpus ☆49 · Updated 3 months ago
- For Med-Gemini, we relabeled the MedQA benchmark; this repo includes the annotations and analysis code. ☆34 · Updated 4 months ago
- SciRepEval benchmark training and evaluation scripts ☆67 · Updated 5 months ago
- Python toolkit for NCBI metadata (via eutils) and pubmed article text mining -- official primary repo. ☆95 · Updated 2 months ago
- PMC-Patients: A Large-scale Dataset of Patient Summaries and Relations for Benchmarking Retrieval-based Clinical Decision Support Systems… ☆54 · Updated 10 months ago
- ISMB'24 "Self-BioRAG: Improving Medical Reasoning through Retrieval and Self-Reflection with Retrieval-Augmented Large Language Models" ☆40 · Updated 7 months ago
- Python client for GROBID Web services ☆284 · Updated 2 weeks ago
- Open Access PDF harvester, metadata aggregator and full-text ingester ☆54 · Updated 6 months ago
- Official implementation of the ACL 2024: Scientific Inspiration Machines Optimized for Novelty ☆68 · Updated 6 months ago
- Parsers for scientific papers (PDF2JSON, TEX2JSON, JATS2JSON) ☆347 · Updated 6 months ago
- PubMedQA: A Dataset for Biomedical Research Question Answering ☆253 · Updated last year
- Repository for paper CELLS: A Parallel Corpus for Biomedical Lay Language Generation ☆14 · Updated 7 months ago
- Biomedical Question Answering Datasets. ☆78 · Updated last year
- Biomedical Named Entity Recognition and Normalization of Diseases, Chemicals and Genetic entity classes through the use of state-of-the… ☆101 · Updated 2 years ago
- A data set based on all arXiv publications, pre-processed for NLP, including structured full-text and citation network ☆258 · Updated last month
- The landscape of biomedical research ☆113 · Updated 6 months ago
- Neighborhood Contrastive Learning for Scientific Document Representations with Citation Embeddings (EMNLP 2022 paper) ☆63 · Updated last year
- Bio relation extraction labeled dataset ☆42 · Updated 2 years ago
- S2APLER: S2 Agglomeration of Papers with Low Error Rate (it's for academic paper clustering) ☆14 · Updated last year
- ChatCell: Facilitating Single-Cell Analysis with Natural Language ☆46 · Updated 8 months ago
- Clinical text summarization by adapting large language models ☆120 · Updated 3 months ago
- PMC-Patients ☆82 · Updated 5 months ago