allenai / papermage
library supporting NLP and CV research on scientific papers
☆753 Updated 4 months ago
Alternatives and similar repositories for papermage:
Users who are interested in papermage are comparing it to the libraries listed below.
- Guideline following Large Language Model for Information Extraction ☆355 Updated 5 months ago
- Train Models Contrastively in Pytorch ☆666 Updated last month
- Python PDF parser for scientific publications: content and figures ☆399 Updated last year
- Generative Representational Instruction Tuning ☆612 Updated last week
- Evaluate your LLM's response with Prometheus and GPT4 💯 ☆885 Updated last week
- RankLLM is a Python toolkit for reproducible information retrieval research using rerankers, with a focus on listwise reranking. ☆418 Updated last week
- Data and tools for generating and inspecting OLMo pre-training data. ☆1,170 Updated 2 weeks ago
- Evaluation suite for LLMs ☆339 Updated 3 months ago
- Easily embed, cluster and semantically label text datasets ☆516 Updated 11 months ago
- [EMNLP 2023] Enabling Large Language Models to Generate Text with Citations. Paper: https://arxiv.org/abs/2305.14627 ☆478 Updated 5 months ago
- Automated Evaluation of RAG Systems ☆563 Updated 4 months ago
- Fast lexical search implementing BM25 in Python using Numpy, Numba and Scipy ☆1,074 Updated this week
- Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-… ☆3,328 Updated last month
- Parsers for scientific papers (PDF2JSON, TEX2JSON, JATS2JSON) ☆359 Updated 11 months ago
- [NeurIPS 2024 Spotlight] Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models ☆613 Updated this week
- A lightweight library for generating synthetic instruction tuning datasets for your data without GPT. ☆754 Updated 3 weeks ago
- Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI ☆1,374 Updated 11 months ago
- A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models. ☆1,345 Updated last week
- SpanMarker for Named Entity Recognition ☆423 Updated 2 months ago
- This includes the original implementation of SELF-RAG: Learning to Retrieve, Generate and Critique through self-reflection by Akari Asai,… ☆2,009 Updated 10 months ago
- Open-source tool to visualise your RAG 🔮 ☆1,119 Updated 2 months ago
- UniTable: Towards a Unified Table Foundation Model ☆448 Updated 9 months ago
- awesome synthetic (text) datasets ☆265 Updated 4 months ago
- Extract structured text from PDFs quickly ☆444 Updated 3 weeks ago
- All-in-one text de-duplication ☆664 Updated 10 months ago
- Fine-Tuning Embedding for RAG with Synthetic Data ☆489 Updated last year
- multimodal document analysis ☆164 Updated 9 months ago