lmarena/search-arena

⚔️ [ICLR 2026] Official code of "Search Arena: Analyzing Search-Augmented LLMs".

☆58

Alternatives and similar repositories for search-arena

Users that are interested in search-arena are comparing it to the libraries listed below.

fouratifares / ECP
Every Call is Precious: Global Optimization of Black-Box Functions with Unknown Lipschitz Constants
☆16 · Apr 23, 2026 · Updated 3 months ago
inclusionAI / GroveMoE
☆24 · Aug 20, 2025 · Updated 11 months ago
yale-nlp / SciArena
Analysis code for NeurIPS 2025 paper "SciArena: An Open Evaluation Platform for Foundation Models in Scientific Literature Tasks"
☆56 · Aug 6, 2025 · Updated 11 months ago
sheng-z / JOCI
Ordinal Common-sense Inference
☆27 · May 15, 2018 · Updated 8 years ago
kuzudb / dspy-kuzu-demo
Intro to using DSPy with Kuzu to enrich the data within the Nobel Laureate mentorship network
☆16 · Sep 16, 2025 · Updated 10 months ago
SamsungSAILMontreal / ByteCraft
☆42 · Apr 9, 2025 · Updated last year
FractalAIResearchLabs / Fathom-DeepResearch
Fathom-DeepResearch: Unlocking Long Horizon Information Retrieval And Synthesis For SLMs
☆62 · Oct 7, 2025 · Updated 9 months ago
mnluzimu / WebGen-Bench
☆54 · Jul 10, 2026 · Updated 2 weeks ago
nju-websoft / TSQA
TSQA: Tabular Scenario Based Question Answering (AAAI 2021)
☆18 · Dec 17, 2020 · Updated 5 years ago
UMass-Embodied-AGI / BudgetGuidance
[ACL'26 Findings] Steering LLM Thinking with Budget Guidance
☆33 · Feb 19, 2026 · Updated 5 months ago
IBM / rag-chunking-techniques
This repository contains the code for implementation of RAG approach with company policies data, evaluation of RAG solution and smart chu…
☆16 · Sep 18, 2025 · Updated 10 months ago
anonymous-sushi-armadillo / fast_is_better_than_free_imagenet
☆10 · Sep 25, 2019 · Updated 6 years ago
TIGER-AI-Lab / MEGA-Bench
This repo contains the code for "MEGA-Bench Scaling Multimodal Evaluation to over 500 Real-World Tasks" [ICLR 2025]
☆81 · Jul 1, 2025 · Updated last year
texttron / BrowseComp-Plus
BrowseComp-Plus: A More Fair and Transparent Evaluation Benchmark of Deep-Research Agent (ACL 2026 Main)
☆319 · May 28, 2026 · Updated 2 months ago
reka-ai / research-eval
A benchmark to evaluate search-augmented LLMs
☆17 · Aug 28, 2025 · Updated 11 months ago
microsoft / ConstrainedReasoner
☆13 · Aug 26, 2024 · Updated last year
thunlp / Seq2Seq-Prompt
Source code for COLING 2022 paper "Automatic Label Sequence Generation for Prompting Sequence-to-sequence Models"
☆24 · Sep 21, 2022 · Updated 3 years ago
sparkle-reasoning / sparkle
[NeurIPS'25] Beyond Accuracy: Dissecting Mathematical Reasoning for LLMs Under Reinforcement Learning
☆16 · Dec 12, 2025 · Updated 7 months ago
xiaofengShi / SPAR
☆27 · Jul 23, 2025 · Updated last year
hanningzhang / ER-PRM
☆20 · Dec 14, 2024 · Updated last year
uwsampl / paper-agents
☆13 · Dec 9, 2024 · Updated last year
yeyimilk / LLMGeo
LLMGeo: Benchmarking Large Language Models on Image Geolocation In-the-wild
☆16 · Oct 31, 2024 · Updated last year
pwnhyo / T-MAP
☆18 · Mar 25, 2026 · Updated 4 months ago
Taishi-N324 / Awesome-RL-Reasoning
Awesome-RL-Reasoning
☆17 · Updated this week
Tencent-Hunyuan / Hunyuan-4B
☆16 · Aug 5, 2025 · Updated 11 months ago
METR / Measuring-Early-2025-AI-on-Exp-OSS-Devs
Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity: https://metr.org/blog/2025-07-10-early-2025-ai-e…
☆16 · Feb 23, 2026 · Updated 5 months ago
slp-rl / SpokenStoryCloze
A spoken version of the textual story cloze benchmark
☆22 · Aug 6, 2023 · Updated 2 years ago
facebookresearch / ReasonIR
Official repository for paper "ReasonIR Training Retrievers for Reasoning Tasks".
☆230 · Jul 2, 2026 · Updated 3 weeks ago
mit-ccc / acl-nuse-personal-narratives
Exploring aspects of similarity between spoken personal narratives by disentangling them into narrative clause types -- Supplementary inf…
☆12 · Jul 14, 2020 · Updated 6 years ago
Helsinki-NLP / OPUS-MT-testsets
benchmarks for evaluating MT models
☆11 · Jun 26, 2024 · Updated 2 years ago
yuhui-zh15 / C3
Official implementation of "Connect, Collapse, Corrupt: Learning Cross-Modal Tasks with Uni-Modal Data" (ICLR 2024)
☆36 · Oct 16, 2024 · Updated last year
flyfj / VisionToolbox
a set of tools for computer vision processing
☆18 · Jul 9, 2016 · Updated 10 years ago
noa / iur
Official repository for the EMNLP 2019 paper, "Learning Invariant Representations of Social Media Users."
☆12 · Aug 27, 2021 · Updated 4 years ago
osome-iu / Botometer101
This repository contains the code for the paper "Botometer 101: Social bot practicum for computational social scientists."
☆11 · Oct 6, 2022 · Updated 3 years ago
Taishi-N324 / Drop-Upcycling
[ICLR 2025] Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization
☆25 · Oct 5, 2025 · Updated 9 months ago
qiancheng0 / ModelingAgent
☆23 · Sep 7, 2025 · Updated 10 months ago
LG-AI-EXAONE / KMMLU-Pro
☆16 · Aug 18, 2025 · Updated 11 months ago
google-research-datasets / wikifact
Wikipedia based dataset to train relationship classifiers and fact extraction models
☆25 · May 25, 2021 · Updated 5 years ago
jlcmoore / llm-delusions-annotations
☆15 · Apr 25, 2026 · Updated 3 months ago