[EMNLP 2024] A Retrieval Benchmark for Scientific Literature Search
☆109Dec 2, 2024Updated last year
Alternatives and similar repositories for LitSearch
Users that are interested in LitSearch are comparing it to the libraries listed below.
- TOON as DSPy adapter☆26Feb 1, 2026Updated 4 months ago
- ☆70Mar 30, 2025Updated last year
- [ICLR 2025] "GraphEval: A Lightweight Graph-Based LLM Framework for Idea Evaluation", Tao Feng, Yihang Sun, Jiaxuan You☆18Mar 18, 2025Updated last year
- ☆20Mar 4, 2025Updated last year
- This repository helps you evaluate your models on the FreshStack benchmark!☆34Dec 9, 2025Updated 6 months ago
- ☆14Aug 25, 2021Updated 4 years ago
- Source code of "Leaky Thoughts: Large Reasoning Models Are Not Private Thinkers" EMNLP 2025☆17Jan 12, 2026Updated 5 months ago
- Codebase for "Linking Surface Facts to Large-Scale Knowledge Graphs" (EMNLP 2023)☆13May 8, 2024Updated 2 years ago
- A Workbench for Autograding Retrieve/Generate Systems☆15Jun 30, 2025Updated 11 months ago
- ☆55Apr 18, 2026Updated 2 months ago
- The repository for paper "Distance between Relevant Information Pieces Causes Bias in Long-Context LLMs"☆14Dec 16, 2024Updated last year
- Create a QnA bot on a pdf☆16May 27, 2023Updated 3 years ago
- Code and Data for "Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering"☆87Aug 12, 2024Updated last year
- SPRINT Toolkit helps you evaluate diverse neural sparse models easily using a single click on any IR dataset.☆48Jul 25, 2023Updated 2 years ago
- TAMU HELIOS Group PyTen Package☆14Nov 27, 2018Updated 7 years ago
- MASSW is a comprehensive text dataset on Multi-Aspect Summarization of Scientific Workflows. MASSW includes more than 152,000 peer-review…☆22May 16, 2025Updated last year
- AAAI 2024, "Working Memory Capacity of ChatGPT: An Empirical Study".☆15Feb 10, 2025Updated last year
- ☆57Apr 18, 2026Updated 2 months ago
- NoMIRACL: A multilingual hallucination evaluation dataset to evaluate LLM robustness in RAG against first-stage retrieval errors on 18 la…☆27Nov 29, 2024Updated last year
- ☆155Aug 21, 2023Updated 2 years ago
- ACL Paper Lists (machine translation)☆13Mar 23, 2022Updated 4 years ago
- Repo housing the open sourced code for the ai2 scholar qa app and also the corresponding library☆281Mar 19, 2026Updated 2 months ago
- Code for Personalized Large Language Models via Selective Prompt Tuning☆10Jun 26, 2024Updated last year
- SciRepEval benchmark training and evaluation scripts☆91May 5, 2026Updated last month
- The source code for running LLMs on the AAAR-1.0 benchmark.☆18Apr 5, 2025Updated last year
- The official implementation of our work SQLFixAgent: Towards Semantic-Accurate Text-to-SQL Parsing via Consistency-Enhanced Multi-Agent C…☆24May 2, 2025Updated last year
- FrugalScore is an approach to learn a fixed, low cost version of any expensive NLG metric, while retaining most of its original performan…☆16Sep 21, 2022Updated 3 years ago
- frozen-in-time version of our Paper Finder agent for reproducing evaluation results☆244Mar 17, 2026Updated 3 months ago
- Fast search index for SPLADE sparse retrieval models implemented in Python using Numpy and Numba☆38Oct 16, 2025Updated 8 months ago
- [ACL2025 Findings] Benchmarking Multihop Multimodal Internet Agents☆54Feb 27, 2025Updated last year
- ☆18Dec 2, 2024Updated last year
- Official Github repo for the paper "Evaluating the Evaluation of Diversity in Natural Language Generation"☆21Feb 23, 2021Updated 5 years ago
- This repository contains ScholarQABench data and evaluation pipeline.☆156Aug 13, 2025Updated 10 months ago
- Official implementation of the ACL 2024: Scientific Inspiration Machines Optimized for Novelty☆94Apr 13, 2024Updated 2 years ago
- Semantic Scholar's Author Disambiguation Algorithm & Evaluation Suite☆109May 30, 2026Updated 2 weeks ago
- ☆21Oct 14, 2025Updated 8 months ago
- This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he…☆31Aug 25, 2023Updated 2 years ago
- ☆29Feb 2, 2024Updated 2 years ago
- CLIR version of ColBERT☆73May 28, 2026Updated 3 weeks ago