allenai / pdf-component-library
☆71 · Updated last year
Alternatives and similar repositories for pdf-component-library:
Users who are interested in pdf-component-library are comparing it to the libraries listed below.
- SciRepEval benchmark training and evaluation scripts ☆73 · Updated 11 months ago
- multimodal document analysis ☆164 · Updated 10 months ago
- This is a public repository to enable researchers to begin their journey of self-hosting data from Semantic Scholar. ☆42 · Updated 5 months ago
- ☆34 · Updated last year
- ☆87 · Updated 11 months ago
- Pretraining Efficiently on S2ORC! ☆161 · Updated 6 months ago
- Logical structure analysis for visually structured documents ☆89 · Updated 2 years ago
- Semantic search engine indexing 110 million academic publications ☆80 · Updated last month
- Factored Cognition Primer: How to write compositional language model programs ☆48 · Updated 2 years ago
- Open Access PDF harvester, metadata aggregator and full-text ingester ☆60 · Updated 11 months ago
- A spaCy wrapper for GliNER ☆112 · Updated 2 months ago
- Parsers for scientific papers (PDF2JSON, TEX2JSON, JATS2JSON) ☆363 · Updated last year
- Public space for the user community of Semantic Scholar APIs to share scripts, report issues, and make suggestions. ☆222 · Updated 3 months ago
- Get answers to research questions from 200M+ papers. ☆206 · Updated last year
- A Benchmark of PDF Information Extraction Tools using a Multi-Task and Multi-Domain Evaluation Framework for Academic Documents ☆23 · Updated 2 years ago
- Viewer for the structure extracted by Grobid on PDF documents ☆48 · Updated 2 months ago
- The Semantic Scholar Search Reranker ☆108 · Updated 4 years ago
- PDF parser powered by grobid ☆26 · Updated 9 months ago
- Edu-ConvoKit: An Open-Source Framework for Education Conversation Data ☆92 · Updated last week
- A high performance bibliographic information service: https://biblio-glutton.readthedocs.io ☆137 · Updated 7 months ago
- The guts for computing data for OpenAlex. For more, see https://openalex.org/. ☆134 · Updated 3 weeks ago
- ☆93 · Updated 11 months ago
- ☆33 · Updated last year
- Interact with the Deep Search platform for new knowledge explorations and discoveries ☆192 · Updated 3 months ago
- Incorporating VIsual LAyout Structures for Scientific Text Classification ☆175 · Updated 2 years ago
- library supporting NLP and CV research on scientific papers ☆764 · Updated 5 months ago
- A Python pipeline tool and plugin ecosystem for processing technical documents. Process papers from arXiv, SemanticScholar, PDF, with GRO… ☆50 · Updated last month
- Service for converting and enhancing heterogeneous publisher XML formats into TEI ☆54 · Updated 7 months ago
- Code and data for the paper 'The impact of founder personalities on startup success' ☆14 · Updated 11 months ago
- 📝 Reference-Free automatic summarization evaluation with potential hallucination detection ☆100 · Updated last year