allenai / pdf-component-library
☆52 Updated 9 months ago
Related projects
Alternatives and complementary repositories for pdf-component-library
- This is a public repository to enable researchers to begin their journey of self-hosting data from Semantic Scholar. ☆32 Updated 2 weeks ago
- Python API for https://vespa.ai, the open big data serving engine ☆105 Updated this week
- A Python pipeline tool and plugin ecosystem for processing technical documents. Process papers from arXiv, SemanticScholar, PDF, with GRO… ☆44 Updated 3 months ago
- Factored Cognition Primer: How to write compositional language model programs ☆48 Updated last year
- Open Access PDF harvester, metadata aggregator and full-text ingester ☆55 Updated 6 months ago
- ☆82 Updated 6 months ago
- 📝 Reference-Free automatic summarization evaluation with potential hallucination detection ☆98 Updated 10 months ago
- Analyzing and scoring reasoning traces of LLMs ☆41 Updated 2 months ago
- This repository includes the official implementation of OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs. ☆99 Updated this week
- Logical structure analysis for visually structured documents ☆84 Updated 2 years ago
- automatic sentence highlights based on their significance to the document ☆181 Updated last year
- multimodal document analysis ☆160 Updated 5 months ago
- 🦦 weasel: A small and easy workflow system ☆68 Updated 4 months ago
- SciRepEval benchmark training and evaluation scripts ☆67 Updated 6 months ago
- Completion After Prompt Probability. Make your LLM make a choice ☆69 Updated 2 weeks ago
- Pretraining Efficiently on S2ORC! ☆136 Updated 3 weeks ago
- ReLM is a Regular Expression engine for Language Models ☆104 Updated last year
- 📄 ⚙️ ETL processes for medical and scientific papers ☆352 Updated 11 months ago
- Implementing the OPRO paper ☆14 Updated last year
- ☆86 Updated 5 months ago
- A spaCy wrapper for GliNER ☆91 Updated 4 months ago
- 🗺️ Data Cleaning and Textual Data Visualization 🗺️ ☆146 Updated 5 months ago
- Edu-ConvoKit: An Open-Source Framework for Education Conversation Data ☆77 Updated 3 months ago
- Parsers for scientific papers (PDF2JSON, TEX2JSON, JATS2JSON) ☆348 Updated 7 months ago
- Social and customizable AI writing assistant! ✍️ ☆189 Updated 4 months ago
- Functional Benchmarks and the Reasoning Gap ☆78 Updated last month
- Chrome Extension for exploring Hugging Face datasets 🔎 ☆48 Updated 2 months ago
- SemanticFinder - frontend-only live semantic search with transformers.js ☆233 Updated 2 months ago
- TextGraphs + LLMs + graph ML for entity extraction, linking, ranking, and constructing a lemma graph ☆20 Updated 8 months ago
- Ref Studio is an open source integrated writing environment for technical writing ☆66 Updated 10 months ago