allenai / papermage
library supporting NLP and CV research on scientific papers
☆788 · Updated last year
Alternatives and similar repositories for papermage
Users who are interested in papermage are comparing it to the libraries listed below.
- Parsers for scientific papers (PDF2JSON, TEX2JSON, JATS2JSON) ☆456 · Updated last year
- Guideline following Large Language Model for Information Extraction ☆426 · Updated last year
- A lightweight library for generating synthetic instruction tuning datasets for your data without GPT. ☆819 · Updated 6 months ago
- Python PDF parser for scientific publications: content and figures ☆451 · Updated last year
- ☆373 · Updated 2 years ago
- Easily embed, cluster and semantically label text datasets ☆592 · Updated last year
- Data and tools for generating and inspecting OLMo pre-training data. ☆1,404 · Updated 3 months ago
- DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤 ☆1,093 · Updated last year
- Train and Infer Powerful Sentence Embeddings with AnglE | 🔥 SOTA on STS and MTEB Leaderboard ☆568 · Updated 3 months ago
- Generative Representational Instruction Tuning ☆686 · Updated 7 months ago
- Forward-Looking Active REtrieval-augmented generation (FLARE) ☆667 · Updated 2 years ago
- [EMNLP 2023] Enabling Large Language Models to Generate Text with Citations. Paper: https://arxiv.org/abs/2305.14627 ☆510 · Updated last year
- S2ORC: The Semantic Scholar Open Research Corpus: https://www.aclweb.org/anthology/2020.acl-main.447/ ☆1,009 · Updated last year
- Python client for GROBID Web services ☆388 · Updated last month
- Automated Evaluation of RAG Systems ☆689 · Updated 10 months ago
- Open-source tool to visualise your RAG 🔮 ☆1,216 · Updated last year
- RankLLM is a Python toolkit for reproducible information retrieval research using rerankers, with a focus on listwise reranking. ☆575 · Updated this week
- UniTable: Towards a Unified Table Foundation Model ☆522 · Updated last year
- Fast lexical search implementing BM25 in Python using Numpy, Numba and Scipy ☆1,477 · Updated last week
- FacTool: Factuality Detection in Generative AI ☆912 · Updated last year
- 🦙 Integrating LLMs into structured NLP pipelines ☆1,362 · Updated last year
- SGPT: GPT Sentence Embeddings for Semantic Search ☆873 · Updated last year
- ToRA is a series of Tool-integrated Reasoning LLM Agents designed to solve challenging mathematical reasoning problems by interacting wit… ☆1,113 · Updated last year
- Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders' ☆1,648 · Updated 2 months ago
- All-in-one text de-duplication ☆742 · Updated last month
- String-to-String Algorithms for Natural Language Processing ☆563 · Updated 2 weeks ago
- Train Models Contrastively in Pytorch ☆775 · Updated 10 months ago
- SpanMarker for Named Entity Recognition ☆465 · Updated last year
- The official implementation of RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval ☆1,572 · Updated last year
- SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models ☆601 · Updated last year