allenai / pdf-component-library
☆82Updated last year
Alternatives and similar repositories for pdf-component-library
Users who are interested in pdf-component-library are comparing it to the libraries listed below.
- This is a public repository to enable researchers to begin their journey of self-hosting data from Semantic Scholar.☆44Updated last year
- Parsers for scientific papers (PDF2JSON, TEX2JSON, JATS2JSON)☆451Updated last year
- Public space for the user community of Semantic Scholar APIs to share scripts, report issues, and make suggestions.☆245Updated 9 months ago
- Edu-ConvoKit: An Open-Source Framework for Education Conversation Data☆101Updated 7 months ago
- Open Access PDF harvester, metadata aggregator and full-text ingester☆63Updated last year
- LLM Comparator is an interactive data visualization tool for evaluating and analyzing LLM responses side-by-side, developed by the PAIR t…☆497Updated 9 months ago
- Interact with the Deep Search platform for new knowledge explorations and discoveries☆219Updated 9 months ago
- Concept Induction: Analyzing Unstructured Text with High-Level Concepts Using LLooM (CHI 2024 paper). LLooM automatically surfaces high-l…☆141Updated 5 months ago
- library supporting NLP and CV research on scientific papers☆784Updated last year
- ☆106Updated 3 weeks ago
- 📄 ⚙️ ETL processes for medical and scientific papers☆401Updated 3 months ago
- ☆99Updated last month
- ☆199Updated 2 weeks ago
- SUQL: Conversational Search over Structured and Unstructured Data with LLMs☆291Updated 3 weeks ago
- A Python pipeline tool and plugin ecosystem for processing technical documents. Process papers from arXiv, SemanticScholar, PDF, with GRO…☆52Updated 8 months ago
- 🗺️ Data Cleaning and Textual Data Visualization 🗺️☆191Updated 6 months ago
- ☆46Updated 3 months ago
- Extract structured text from PDFs quickly☆624Updated 5 months ago
- Python client for GROBID Web services☆376Updated this week
- Benchmarking PDF libraries☆315Updated 4 months ago
- All the OpenAlex API endpoints that are backed by Elasticsearch☆37Updated last week
- In-Context Learning for eXtreme Multi-Label Classification (XMC) using only a handful of examples.☆442Updated last year
- SciRepEval benchmark training and evaluation scripts☆76Updated last week
- Dataset and annotations for ASSETS 2022 publication☆12Updated 3 years ago
- Python API for https://vespa.ai, the open big data serving engine☆147Updated last week
- Attribute (or cite) statements generated by LLMs back to in-context information.☆300Updated last year
- Get answers to research questions from 200M+ papers.☆206Updated 2 weeks ago
- A Python implementation of priompt - a neat way of managing context from diverse sources for LLM applications.☆113Updated 4 months ago
- ☆100Updated last year
- Simple package to extract text with coordinates from programmatic PDFs☆214Updated 2 weeks ago