ETL processes for medical and scientific papers
☆677 · Dec 7, 2025 · Updated 4 months ago
Alternatives and similar repositories for paperetl
Users who are interested in paperetl are comparing it to the libraries listed below.
- AI for medical and scientific papers ☆1,750 · Jul 9, 2025 · Updated 9 months ago
- Local chat assistants with AI superpowers ☆336 · Feb 13, 2026 · Updated 2 months ago
- All-in-one AI framework for semantic search, LLM orchestration and language model workflows ☆12,395 · Apr 8, 2026 · Updated last week
- A machine learning software for extracting information from scholarly documents ☆4,776 · Apr 9, 2026 · Updated last week
- Semantic search for headlines and story text ☆359 · Sep 23, 2023 · Updated 2 years ago
- Python client for txtai ☆15 · Mar 20, 2026 · Updated 3 weeks ago
- Build knowledge bases for RAG ☆32 · Jul 3, 2025 · Updated 9 months ago
- COVID-19 Open Research Dataset (CORD-19) Analysis ☆57 · Nov 20, 2022 · Updated 3 years ago
- Tokenizer for Text to Speech (TTS) models ☆13 · Jan 16, 2025 · Updated last year
- PDF parser powered by grobid ☆28 · Jul 26, 2024 · Updated last year
- Open Access PDF harvester, metadata aggregator and full-text ingester ☆62 · May 3, 2024 · Updated last year
- Semantic search for developers ☆543 · Sep 23, 2023 · Updated 2 years ago
- Automatically annotate papers using LLMs ☆411 · Dec 1, 2025 · Updated 4 months ago
- Highlight text in documents ☆113 · Feb 13, 2026 · Updated 2 months ago
- Datasets and models for instruction-tuning ☆238 · Sep 23, 2023 · Updated 2 years ago
- Open Access PDF harvester ☆42 · May 3, 2024 · Updated last year
- This is a public repository to enable researchers to begin their journey of self-hosting data from Semantic Scholar. ☆47 · Nov 7, 2024 · Updated last year
- Magnitude fork that only supports Word2Vec, GloVe and fastText embeddings ☆13 · Aug 11, 2020 · Updated 5 years ago
- High accuracy RAG for answering questions from scientific documents with citations ☆8,368 · Mar 20, 2026 · Updated 3 weeks ago
- Scientific literature explorer. Runs a PubMed or Semantic Scholar search and allows the user to explore the high-level structure of the resulting papers ☆51 · Mar 16, 2026 · Updated last month
- A full spaCy pipeline and models for scientific/biomedical documents. ☆1,940 · Dec 4, 2025 · Updated 4 months ago
- My Gen AI research ☆11 · Jun 3, 2024 · Updated last year
- ☆18 · Nov 7, 2023 · Updated 2 years ago
- Python PDF parser for scientific publications: content and figures ☆452 · Mar 21, 2024 · Updated 2 years ago
- Science Parse parses scientific papers (in PDF form) and returns them in structured form. ☆698 · May 26, 2024 · Updated last year
- Findpapers: A tool for helping researchers who are looking for related works ☆334 · Apr 7, 2026 · Updated last week
- Tools to scrape publications & their metadata from pubmed, arxiv, medrxiv, biorxiv and chemrxiv. ☆509 · Mar 17, 2026 · Updated 3 weeks ago
- Core code for Profiles RNS ☆19 · Dec 18, 2025 · Updated 3 months ago
- A Python pipeline tool and plugin ecosystem for processing technical documents. Process papers from arXiv, SemanticScholar, PDF, with GRO… ☆52 · Mar 17, 2025 · Updated last year
- MinScIE is an Open Information Extraction system which provides structured knowledge enriched with semantic information about citations. ☆15 · Jun 9, 2019 · Updated 6 years ago
- Software that makes labeling PDFs easy. ☆428 · May 13, 2024 · Updated last year
- h-index-reader is a module that allows you to retrieve author's h-index information from different sources including Google Scholar. ☆14 · Oct 22, 2020 · Updated 5 years ago
- Open-source AI orchestration framework for building context-engineered, production-ready LLM applications. Design modular pipelines and a… ☆24,815 · Updated this week
- Dehyphenation of broken text (mainly German), i.e., extracted from a PDF ☆39 · Mar 8, 2022 · Updated 4 years ago
- Retrieval Augmented Generation (RAG) with txtai. Combine search and LLMs to find insights with your own data. ☆447 · Dec 1, 2025 · Updated 4 months ago
- Fetch Academic Research Papers from different sources ☆476 · Dec 24, 2025 · Updated 3 months ago
- S2ORC: The Semantic Scholar Open Research Corpus: https://www.aclweb.org/anthology/2020.acl-main.447/ ☆1,038 · Apr 26, 2024 · Updated last year
- This repo provides the server side code for llmsherpa API to connect. It includes parsers for various file formats. ☆1,282 · Mar 28, 2025 · Updated last year
- Retrieve and extract citations from Crossref data ☆29 · Mar 11, 2021 · Updated 5 years ago