yale-nlp/SciArena

Analysis code for NeurIPS 2025 paper "SciArena: An Open Evaluation Platform for Foundation Models in Scientific Literature Tasks"

☆56

Alternatives and similar repositories for SciArena

Users that are interested in SciArena are comparing it to the libraries listed below.

tangzhy / RealCritic
☆15 · Jan 27, 2025 · Updated last year
dgarcia-eu / SocialMediaDataAnalysis
Online materials for Social Media Data Analysis at the University of Konstanz
☆10 · Oct 13, 2025 · Updated 9 months ago
yale-nlp / Bright-Pro
Data and code for ACL 2026 Paper "Rethinking Reasoning-Intensive Retrieval: Evaluating and Advancing Retrievers in Agentic Search Systems…
☆19 · Apr 30, 2026 · Updated 2 months ago
Fu-Fu-Fu-Fu / VideoKR
[ICML 26 Spotlight] Code for paper "VideoKR: Towards Knowledge- and Reasoning-Intensive Video Understanding"
☆19 · Jun 5, 2026 · Updated last month
yale-nlp / FinanceMath
Data and Code for the paper "FinanceMath: Knowledge-Intensive Math Reasoning in Finance Domains"
☆25 · Jul 14, 2026 · Updated 2 weeks ago
bnewm0609 / arxivDIGESTables
☆18 · Sep 15, 2025 · Updated 10 months ago
ChengpengLi1003 / CoRT
☆72 · Oct 23, 2025 · Updated 9 months ago
XinyuanLu00 / SciTab
The project page for "SCITAB: A Challenging Benchmark for Compositional Reasoning and Claim Verification on Scientific Tables"
☆23 · Dec 21, 2023 · Updated 2 years ago
yale-nlp / QTSumm
Data and Code for EMNLP 2023 paper "QTSumm: Query-Focused Summarization over Tabular Data"
☆23 · Mar 29, 2024 · Updated 2 years ago
strangeloopcanon / ParaLLM
CLI that queries multiple language models in parallel using prompts from a CSV file
☆28 · Sep 24, 2025 · Updated 10 months ago
huggingface / feel
☆15 · May 26, 2026 · Updated 2 months ago
metehan777 / alsoasked-mcp
AlsoAsked MCP Server
☆15 · Jun 9, 2025 · Updated last year
glaive-ai / reflection_70b_training
☆17 · Feb 12, 2025 · Updated last year
allenai / autodiscovery-neurips
Official code for NeurIPS 2025 paper "AutoDiscovery: Open-ended Scientific Discovery via Bayesian Surprise"
☆196 · Jul 2, 2026 · Updated 3 weeks ago
metehan777 / moz-mcp
Moz SEO Tool MCP
☆15 · Jun 13, 2025 · Updated last year
Table-R1 / Table-R1
[EMNLP 2025] Code for paper "Table-R1: Inference-Time Scaling for Table Reasoning"
☆32 · Jun 3, 2025 · Updated last year
dnakov / cc-gh
☆15 · Jun 17, 2025 · Updated last year
RAG-Gym / RAG-Gym
Official repository for RAG-Gym
☆126 · Jul 14, 2026 · Updated 2 weeks ago
NousResearch / smc-inference-server
☆34 · Dec 15, 2025 · Updated 7 months ago
allenai / chime
Repository containing dataset, models and code associated with the CHIME project
☆18 · Aug 22, 2024 · Updated last year
allenai / asta-paper-finder
frozen-in-time version of our Paper Finder agent for reproducing evaluation results
☆244 · Mar 17, 2026 · Updated 4 months ago
Essential-AI / eai-taxonomy
☆59 · Aug 19, 2025 · Updated 11 months ago
nexusflowai / NexusBench
Nexusflow function call, tool use, and agent benchmarks.
☆29 · Dec 13, 2024 · Updated last year
Kwai-Klear / Klear1.0
☆19 · Sep 7, 2025 · Updated 10 months ago
ViNeek / FLOW
P2P Live Video Streaming
☆11 · Dec 9, 2013 · Updated 12 years ago
kuzudb / dspy-kuzu-demo
Intro to using DSPy with Kuzu to enrich the data within the Nobel Laureate mentorship network
☆16 · Sep 16, 2025 · Updated 10 months ago
inclusionAI / Ring-V2
Ring-V2 is a reasoning MoE LLM provided and open-sourced by InclusionAI.
☆98 · Oct 23, 2025 · Updated 9 months ago
yilunzhao / Awsome-Table-Reasoning
A comprehensive paper list of Reasoning over Tables.
☆30 · Nov 6, 2022 · Updated 3 years ago
davisrbr / conjectures-arxiv
OpenConjecture, a dataset of mathematics conjectures pulled from papers published to the ArXiv
☆16 · Jul 12, 2026 · Updated 2 weeks ago
stockeh / mlx-grokking
Grokking on modular arithmetic in less than 150 epochs in MLX
☆15 · Oct 24, 2024 · Updated last year
Kwai-Klear / RLEP
RL with Experience Replay
☆58 · Jul 27, 2025 · Updated last year
lfy79001 / Awesome-Table-QA
A comprehensive paper list of Table-based Question Answering.
☆40 · Sep 1, 2023 · Updated 2 years ago
astonishedrobo / tabulens
🔍📃 LLM-powered PDF Table Extractor
☆19 · Jun 26, 2025 · Updated last year
SakanaAI / natural_niches
The code repository of the paper: Competition and Attraction Improve Model Fusion
☆170 · Aug 25, 2025 · Updated 11 months ago
facebookresearch / UNLU
Code for the paper "UnNatural Language Inference" to appear at ACL 2021 (Long Paper)
☆37 · Aug 31, 2021 · Updated 4 years ago
allenai / noncompliance
This repository contains data, code and models for contextual noncompliance.
☆26 · Jul 18, 2024 · Updated 2 years ago
eyra / port
[DEPRECATED] Scientific donation tool for digital trace data
☆24 · Aug 4, 2025 · Updated 11 months ago
yilunzhao / RobuT
Data and code for ACL 2023 paper "RobuT: A Systematic Study of Table QA Robustness Against Human-Annotated Adversarial Perturbations"
☆15 · Feb 8, 2024 · Updated 2 years ago
fff1969-code / rmu-fix
☆10 · May 20, 2023 · Updated 3 years ago