allenai / papermage
library supporting NLP and CV research on scientific papers
☆704 Updated 2 weeks ago
Related projects
Alternatives and complementary repositories for papermage
- Easily embed, cluster and semantically label text datasets ☆463 Updated 7 months ago
- Data and tools for generating and inspecting OLMo pre-training data. ☆993 Updated this week
- Evaluation suite for LLMs ☆312 Updated 3 weeks ago
- RankLLM is a Python toolkit for reproducible information retrieval research using rerankers, with a focus on listwise reranking. ☆349 Updated last week
- Generative Representational Instruction Tuning ☆567 Updated this week
- A lightweight library for generating synthetic instruction tuning datasets for your data without GPT. ☆701 Updated 2 months ago
- Fast lexical search implementing BM25 in Python using Numpy, Numba and Scipy ☆908 Updated last week
- Automated Evaluation of RAG Systems ☆484 Updated 2 weeks ago
- A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models. ☆1,095 Updated last week
- Efficient Retrieval Augmentation and Generation Framework ☆1,341 Updated last week
- ☆451 Updated 3 weeks ago
- The official implementation of RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval ☆966 Updated 2 months ago
- ☆333 Updated 11 months ago
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi… ☆1,634 Updated this week
- Official repository for "Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing". Your efficient and high-quality s… ☆491 Updated 2 weeks ago
- The papers are organized according to our survey: Evaluating Large Language Models: A Comprehensive Survey. ☆714 Updated 6 months ago
- ☆1,274 Updated this week
- ⚡FlashRAG: A Python Toolkit for Efficient RAG Research ☆1,346 Updated this week
- Guideline following Large Language Model for Information Extraction ☆313 Updated 3 weeks ago
- Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks. ☆2,048 Updated this week
- This includes the original implementation of SELF-RAG: Learning to Retrieve, Generate and Critique through self-reflection by Akari Asai,… ☆1,840 Updated 5 months ago
- SGPT: GPT Sentence Embeddings for Semantic Search ☆852 Updated 9 months ago
- ToRA is a series of Tool-integrated Reasoning LLM Agents designed to solve challenging mathematical reasoning problems by interacting wit… ☆981 Updated 8 months ago
- ☆1,832 Updated 6 months ago
- Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders' ☆1,296 Updated last month
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends ☆811 Updated this week
- [NeurIPS 2024 Spotlight] Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models ☆535 Updated 3 weeks ago
- Doing simple retrieval from LLM models at various context lengths to measure accuracy ☆1,570 Updated 3 months ago
- Train Models Contrastively in Pytorch ☆546 Updated this week
- Contriever: Unsupervised Dense Information Retrieval with Contrastive Learning ☆685 Updated last year