AkariAsai/ScholarQABench

This repository contains ScholarQABench data and evaluation pipeline.

☆158

Alternatives and similar repositories for ScholarQABench

Users who are interested in ScholarQABench are comparing it to the repositories listed below.

AkariAsai / OpenScholar_ExpertEval
This repository contains expert evaluation interface and data evaluation script for the OpenScholar project.
☆42 · Nov 19, 2024 · Updated last year
allenai / openscilm
Demo for https://arxiv.org/abs/2411.14199
☆20 · Apr 6, 2026 · Updated 3 months ago
AkariAsai / OpenScholar
This repository includes the official implementation of OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs.
☆1,565 · Aug 13, 2025 · Updated 11 months ago
bnewm0609 / arxivDIGESTables
☆18 · Sep 15, 2025 · Updated 10 months ago
yikee / ScienceMeter
ScienceMeter: Tracking Scientific Knowledge Updates in Language Models, COLM 2026
☆17 · Jun 28, 2025 · Updated last year
princeton-nlp / LitSearch
[EMNLP 2024] A Retrieval Benchmark for Scientific Literature Search
☆109 · Dec 2, 2024 · Updated last year
ariecattan / SciCo
Code for the paper SciCo: Hierarchical Cross-Document Coreference for Scientific Concepts (AKBC 2021). https://openreview.net/forum?id=OF…
☆30 · Oct 17, 2021 · Updated 4 years ago
RulinShao / retrieval-scaling
Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore".
☆226 · Dec 16, 2025 · Updated 7 months ago
allenai / ai2-scholarqa-lib
Repo housing the open-source code for the Ai2 Scholar QA app and its corresponding library
☆279 · Jun 25, 2026 · Updated last month
SALT-NLP / search_privacy_risk
Code for the paper "Searching Privacy Risks in Multi-Agent Systems via Simulation"
☆24 · Oct 13, 2025 · Updated 9 months ago
BiostateAI / GeneJEPA
A Predictive World Model of the Transcriptome
☆35 · Oct 18, 2025 · Updated 9 months ago
rlresearch / dr-tulu
Official repository for DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research
☆692 · Jun 17, 2026 · Updated last month
paul-rottger / issuebench
Röttger et al. (2024): "IssueBench: Millions of Realistic Prompts for Measuring Issue Bias in LLM Writing Assistance"
☆17 · Mar 6, 2026 · Updated 4 months ago
reka-ai / research-eval
A benchmark to evaluate search-augmented LLMs
☆17 · Aug 28, 2025 · Updated 11 months ago
AstraZeneca / cellatria
An Agentic AI Framework for Ingestion and Standardization of Single-Cell RNA-seq Data Analysis
☆80 · May 19, 2026 · Updated 2 months ago
multilexsum / dataset
Multi-LexSum is an abstractive summarization dataset for US Civil Rights Lawsuits
☆23 · Dec 15, 2022 · Updated 3 years ago
UKPLab / PeerQA
Code and Data for PeerQA: A Scientific Question Answering Dataset from Peer Reviews, NAACL 2025 https://aclanthology.org/2025.naacl-long.…
☆15 · Jun 1, 2026 · Updated last month
facebookresearch / reconsider
ReConsider is a re-ranking model that re-ranks the top-K (passage, answer-span) predictions of an Open-Domain QA Model like DPR (Karpukhi…
☆50 · Apr 26, 2021 · Updated 5 years ago
phipsonlab / SuperCellCyto
SuperCell for Cytometry data
☆13 · Oct 14, 2025 · Updated 9 months ago
cquzys / SANGO
The official implementation for "SANGO".
☆11 · Mar 17, 2024 · Updated 2 years ago
genbio-ai / foundation-models-perturbation
Code for "Foundation Models Improve Perturbation Response Prediction"
☆27 · Apr 16, 2026 · Updated 3 months ago
Lotfollahi-lab / Perturbgen
☆25 · Jul 22, 2026 · Updated last week
zoranmedic / mdcr
Benchmark dataset for the evaluation of scientific article representations on the task of citation recommendation across various scientif…
☆12 · Oct 21, 2022 · Updated 3 years ago
Cen-Jipeng-SUDA / SQLFixAgent
The official implementation of our work SQLFixAgent: Towards Semantic-Accurate Text-to-SQL Parsing via Consistency-Enhanced Multi-Agent C…
☆27 · May 2, 2025 · Updated last year
allenai / agent-baselines
☆150 · Updated this week
BGIResearch / RegFormer
☆20 · Nov 12, 2025 · Updated 8 months ago
dhglab / Broad-transcriptomic-dysregulation-across-the-cerebral-cortex-in-ASD
Code accompanying the manuscript "Broad transcriptomic dysregulation across the cerebral cortex in ASD".
☆13 · Aug 10, 2022 · Updated 3 years ago
RUCKBReasoning / DPO_Text2SQL
[ACL 2025] Uncovering the Impact of Chain-of-Thought Reasoning for Direct Preference Optimization: Lessons from Text-to-SQL
☆16 · Oct 9, 2025 · Updated 9 months ago
Websail-NU / CODAH
Repository for the CODAH dataset
☆22 · Oct 29, 2022 · Updated 3 years ago
deadshot465 / novelcrafter-mcp
An experimental desktop client for using Claude Desktop's MCP with Novelcrafter codices.
☆11 · Dec 3, 2024 · Updated last year
Gen-Verse / CURE
[NeurIPS 2025 Spotlight] Co-Evolving LLM Coder and Unit Tester via Reinforcement Learning
☆167 · Sep 19, 2025 · Updated 10 months ago
KatherLab / llmaixweb
A web interface for the LLMAIx framework. Information Extraction.
☆20 · Jul 22, 2026 · Updated last week
lueckenlab / single-cell-papers-trends
Trends in single cell papers
☆17 · Sep 23, 2024 · Updated last year
theislab / cellrank2_reproducibility
CellRank 2's reproducibility repository.
☆15 · Sep 4, 2024 · Updated last year
Anikethh / ResearchGym
Benchmark and execution environment for evaluating LLM agents on end-to-end AI Research. [ICLR 2026]
☆35 · May 31, 2026 · Updated last month
felipemaiapolo / prompteval
Efficient multi-prompt evaluation of LLMs
☆33 · Dec 6, 2024 · Updated last year
dginev / ar5iv-css
Some CSS experiments for arXiv HTML documents converted via latexml
☆20 · Jul 5, 2026 · Updated 3 weeks ago
guestrin-lab / deepscholar
Build and benchmark deep research
☆245 · Mar 28, 2026 · Updated 4 months ago
google / spiqa
Code release for "SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers" [NeurIPS D&B, 2024]
☆76 · Jan 13, 2025 · Updated last year