moaraio / SS-self-hosting

This is a public repository to enable researchers to begin their journey of self-hosting data from Semantic Scholar.

☆42

Alternatives and similar repositories for SS-self-hosting

Users who are interested in SS-self-hosting are comparing it to the repositories listed below.

KarelDO / xmc.dspy
In-Context Learning for eXtreme Multi-Label Classification (XMC) using only a handful of examples.
☆434 Updated last year
allenai / pdf-component-library
☆77 Updated last year
titipata / scipdf_parser
Python PDF parser for scientific publications: content and figures
☆420 Updated last year
allenai / s2orc-doc2json
Parsers for scientific papers (PDF2JSON, TEX2JSON, JATS2JSON)
☆429 Updated last year
stanford-oval / suql
SUQL: Conversational Search over Structured and Unstructured Data with LLMs
☆278 Updated 2 weeks ago
jackboyla / GLiREL
Generalist and Lightweight Model for Relation Extraction (Extract any relationship types from text)
☆230 Updated last month
SapienzaNLP / relik
Retrieve, Read and LinK: Fast and Accurate Entity Linking and Relation Extraction on an Academic Budget (ACL 2024)
☆442 Updated last week
neuml / paperetl
📄 ⚙️ ETL processes for medical and scientific papers
☆394 Updated this week
allenai / s2-folks
Public space for the user community of Semantic Scholar APIs to share scripts, report issues, and make suggestions.
☆236 Updated 6 months ago
allenai / SPECTER2
☆93 Updated last year
charlesdedampierre / BunkaTopics
🗺️ Data Cleaning and Textual Data Visualization 🗺️
☆183 Updated 2 months ago
MadryLab / context-cite
Attribute (or cite) statements generated by LLMs back to in-context information.
☆268 Updated 9 months ago
neuml / annotateai
📝 Automatically annotate papers using LLMs
☆332 Updated 3 months ago
allenai / papermage
Library supporting NLP and CV research on scientific papers
☆778 Updated 8 months ago
lightonai / pylate
Late Interaction Models Training & Retrieval
☆521 Updated 2 weeks ago
MoritzLaurer / zeroshot-classifier
Notebooks for training universal 0-shot classifiers on many different tasks
☆133 Updated 7 months ago
shauryr / S2QA
Get answers to research questions from 200M+ papers. Link to demo -
☆205 Updated last year
epfl-dlab / aiflows
🤖🌊 aiFlows: The building blocks of your collaborative AI
☆259 Updated last year
harveyai / biglaw-bench
☆81 Updated 8 months ago
Muhtasham / summarization-eval
📝 Reference-Free automatic summarization evaluation with potential hallucination detection
☆101 Updated last year
hitz-zentroa / GoLLIE
Guideline following Large Language Model for Information Extraction
☆391 Updated 9 months ago
athina-ai / athina-evals
Python SDK for running evaluations on LLM generated responses
☆291 Updated 2 months ago
Knowledgator / GLiClass
Generalist and Lightweight Model for Text Classification
☆148 Updated last month
whyhow-ai / rule-based-retrieval
The Rule-based Retrieval package is a Python package that enables you to create and manage Retrieval Augmented Generation (RAG) applicati…
☆245 Updated 10 months ago
vespa-engine / pyvespa
Python API for https://vespa.ai, the open big data serving engine
☆133 Updated this week
colonelwatch / abstracts-search
Semantic search engine indexing 110 million academic publications
☆91 Updated 3 weeks ago
taylorai / galactic
Data cleaning and curation for unstructured text
☆328 Updated last year
jxnl / n-levels-of-rag
☆195 Updated last year
allenai / scirepeval
SciRepEval benchmark training and evaluation scripts
☆75 Updated last year
DS4SD / deepsearch-toolkit
Interact with the Deep Search platform for new knowledge explorations and discoveries
☆209 Updated 6 months ago