allenai / papermage
library supporting NLP and CV research on scientific papers
☆732 Updated 2 months ago
Alternatives and similar repositories for papermage:
Users interested in papermage are comparing it to the libraries listed below.
- Easily embed, cluster and semantically label text datasets ☆494 Updated 10 months ago
- Train Models Contrastively in PyTorch ☆572 Updated 3 weeks ago
- A lightweight library for generating synthetic instruction tuning datasets for your data without GPT. ☆733 Updated 4 months ago
- Fast lexical search implementing BM25 in Python using NumPy, Numba and SciPy ☆990 Updated 2 weeks ago
- A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models. ☆1,261 Updated last week
- Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders' ☆1,408 Updated this week
- [EMNLP 2023] Enabling Large Language Models to Generate Text with Citations. Paper: https://arxiv.org/abs/2305.14627 ☆473 Updated 3 months ago
- RankLLM is a Python toolkit for reproducible information retrieval research using rerankers, with a focus on listwise reranking. ☆397 Updated last week
- Guideline-following Large Language Model for Information Extraction ☆335 Updated 3 months ago
- Generative Representational Instruction Tuning ☆588 Updated last week
- Data and tools for generating and inspecting OLMo pre-training data. ☆1,067 Updated 2 weeks ago
- S2ORC: The Semantic Scholar Open Research Corpus: https://www.aclweb.org/anthology/2020.acl-main.447/ ☆861 Updated 9 months ago
- A Comprehensive Benchmark for Document Parsing and Evaluation ☆211 Updated last week
- Python PDF parser for scientific publications: content and figures ☆383 Updated 10 months ago
- UniTable: Towards a Unified Table Foundation Model ☆413 Updated 7 months ago
- Explore and interpret large embeddings in your browser with interactive visualization! 📍 ☆439 Updated 11 months ago
- Evaluation suite for LLMs ☆329 Updated last month
- Parsers for scientific papers (PDF2JSON, TEX2JSON, JATS2JSON) ☆354 Updated 9 months ago
- Bringing BERT into modernity via both architecture changes and scaling ☆1,119 Updated last week
- Train and Infer Powerful Sentence Embeddings with AnglE | 🔥 SOTA on STS and MTEB Leaderboard ☆508 Updated 2 weeks ago
- ⚡FlashRAG: A Python Toolkit for Efficient RAG Research (WWW2025 Resource) ☆1,689 Updated last week
- Efficient Retrieval Augmentation and Generation Framework ☆1,428 Updated 2 weeks ago
- Is ChatGPT Good at Search? LLMs as Re-Ranking Agent [EMNLP 2023 Outstanding Paper Award] ☆561 Updated 10 months ago
- RAGChecker: A Fine-grained Framework For Diagnosing RAG ☆719 Updated last month
- Framework for enhancing LLMs for RAG tasks using fine-tuning. ☆523 Updated last month
- [NeurIPS'24] HippoRAG is a novel RAG framework inspired by human long-term memory that enables LLMs to continuously integrate knowledge a… ☆1,530 Updated 2 weeks ago
- Code for explaining and evaluating late chunking (chunked pooling) ☆314 Updated last month
- This is a repository of RALM surveys containing a summary of state-of-the-art RAG and other technologies ☆193 Updated 7 months ago