allenai / papermage
library supporting NLP and CV research on scientific papers
☆771 · Updated 6 months ago
Alternatives and similar repositories for papermage
Users who are interested in papermage compare it to the libraries listed below.
- RankLLM is a Python toolkit for reproducible information retrieval research using rerankers, with a focus on listwise reranking. ☆445 · Updated last week
- Guideline following Large Language Model for Information Extraction ☆371 · Updated 6 months ago
- Parsers for scientific papers (PDF2JSON, TEX2JSON, JATS2JSON) ☆398 · Updated last year
- Easily embed, cluster and semantically label text datasets ☆534 · Updated last year
- Python PDF parser for scientific publications: content and figures ☆403 · Updated last year
- Generative Representational Instruction Tuning ☆634 · Updated 2 months ago
- Train Models Contrastively in Pytorch ☆702 · Updated last month
- Data and tools for generating and inspecting OLMo pre-training data. ☆1,214 · Updated this week
- S2ORC: The Semantic Scholar Open Research Corpus: https://www.aclweb.org/anthology/2020.acl-main.447/ ☆925 · Updated last year
- Fast lexical search implementing BM25 in Python using Numpy, Numba and Scipy ☆1,153 · Updated last month
- Is ChatGPT Good at Search? LLMs as Re-Ranking Agent [EMNLP 2023 Outstanding Paper Award] ☆601 · Updated last year
- [EMNLP 2023] Enabling Large Language Models to Generate Text with Citations. Paper: https://arxiv.org/abs/2305.14627 ☆484 · Updated 7 months ago
- TF-ID: Table/Figure IDentifier for academic papers ☆232 · Updated 10 months ago
- A lightweight library for generating synthetic instruction tuning datasets for your data without GPT. ☆773 · Updated 2 months ago
- ☆360 · Updated last year
- Evaluation suite for LLMs ☆348 · Updated last month
- Python client for GROBID Web services ☆325 · Updated 2 months ago
- Efficient Retrieval Augmentation and Generation Framework ☆1,536 · Updated 4 months ago
- UniTable: Towards a Unified Table Foundation Model ☆467 · Updated 11 months ago
- Automated Evaluation of RAG Systems ☆590 · Updated last month
- ☆515 · Updated 5 months ago
- Evaluate your LLM's response with Prometheus and GPT4 💯 ☆938 · Updated 3 weeks ago
- ToRA is a series of Tool-integrated Reasoning LLM Agents designed to solve challenging mathematical reasoning problems by interacting wit… ☆1,067 · Updated last year
- A data set based on all arXiv publications, pre-processed for NLP, including structured full-text and citation network ☆289 · Updated 7 months ago
- Retrieve, Read and LinK: Fast and Accurate Entity Linking and Relation Extraction on an Academic Budget (ACL 2024) ☆419 · Updated 7 months ago
- In-Context Learning for eXtreme Multi-Label Classification (XMC) using only a handful of examples. ☆421 · Updated last year
- DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤 ☆1,015 · Updated 3 months ago
- Extend existing LLMs way beyond the original training length with constant memory usage, without retraining ☆698 · Updated last year
- Contriever: Unsupervised Dense Information Retrieval with Contrastive Learning ☆735 · Updated 2 years ago
- This repository includes the official implementation of OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs. ☆681 · Updated last month