allenai/agent-baselines

allenai / agent-baselines

☆150

Alternatives and similar repositories for agent-baselines

Users that are interested in agent-baselines are comparing it to the libraries listed below.

allenai / asta-bench
☆124 · Updated this week
allenai / asta-paper-finder
frozen-in-time version of our Paper Finder agent for reproducing evaluation results
☆244 · Mar 17, 2026 · Updated 4 months ago
allenai / discoverybench
Discovering Data-driven Hypotheses in the Wild
☆157 · Jun 9, 2025 · Updated last year
allenai / ai2-scholarqa-lib
Repo housing the open sourced code for the ai2 scholar qa app and also the corresponding library
☆279 · Jun 25, 2026 · Updated last month
allenai / neurodiscoverybench
☆22 · Jan 29, 2026 · Updated 6 months ago
allenai / fluid-benchmarking
Fluid Language Model Benchmarking
☆29 · Sep 16, 2025 · Updated 10 months ago
allenai / autodiscovery-neurips
Official code for NeurIPS 2025 paper "AutoDiscovery: Open-ended Scientific Discovery via Bayesian Surprise"
☆196 · Jul 2, 2026 · Updated 3 weeks ago
allenai / asta-theorizer
Staging area for a public release of Theorizer
☆170 · May 2, 2026 · Updated 2 months ago
ChicagoHAI / future-of-science-roadmap
☆21 · Oct 29, 2025 · Updated 8 months ago
allenai / prescience
PreScience: A Benchmark for Forecasting Scientific Contributions
☆32 · Updated this week
allenai / AskOlmo
☆15 · Nov 19, 2025 · Updated 8 months ago
Anikethh / IRIS-Interactive-Research-Ideation-System
A platform for Interactive AI-assisted Hypothesis Generation [ACL 2025]
☆36 · May 10, 2026 · Updated 2 months ago
allenai / codescientist
CodeScientist: An automated scientific discovery system for code-based experiments
☆345 · Mar 25, 2026 · Updated 4 months ago
allenai / artifact-linker
ArtifactLinker: Linking Scientific Artifacts for Automatic State-of-the-Art Discovery
☆40 · Jul 20, 2026 · Updated last week
hbiaou / openalex-mcp
An MCP server designed for academic literature research using the OpenAlex free API.
☆15 · Jun 25, 2025 · Updated last year
SALT-NLP / multi-value
Complete set of English dialect transformation rules and evaluation code
☆16 · Jun 7, 2024 · Updated 2 years ago
orevaahia / magnet-tokenization
☆11 · Mar 17, 2026 · Updated 4 months ago
zoranmedic / mdcr
Benchmark dataset for the evaluation of scientific article representations on the task of citation recommendation across various scientif…
☆12 · Oct 21, 2022 · Updated 3 years ago
anguyen8 / peeb
[NAACL 2024] Part-based, explainable and editable fine-grained image classifier that allows users to define a species in text
☆14 · Sep 19, 2025 · Updated 10 months ago
commoncrawl / cc-citations
Scientific articles using or citing Common Crawl data
☆29 · Jul 8, 2026 · Updated 3 weeks ago
ajyl / mech_int_othelloGPT
☆10 · Nov 6, 2024 · Updated last year
stanford-oval / Lemonade
LEMONADE: A Large Multilingual Expert-Annotated Abstractive Event Dataset for the Real World
☆22 · Nov 8, 2025 · Updated 8 months ago
yale-nlp / SciArena
Analysis code for NeurIPS 2025 paper "SciArena: An Open Evaluation Platform for Foundation Models in Scientific Literature Tasks"
☆56 · Aug 6, 2025 · Updated 11 months ago
allenai / bolmo-core
Code for Bolmo: Byteifying the Next Generation of Language Models
☆136 · Jul 6, 2026 · Updated 3 weeks ago
allenai / recoma
Reasoning by Communicating with Agents
☆30 · Apr 29, 2025 · Updated last year
cisnlp / bias-in-nlp
Literature overview: gender bias in natural language processing
☆12 · Jan 26, 2021 · Updated 5 years ago
au-clan / cachesaver
☆30 · Feb 11, 2026 · Updated 5 months ago
allenai / safety-eval
A simple evaluation of generative language models and safety classifiers.
☆105 · Jun 16, 2026 · Updated last month
apple / ml-ppg-age-analysis
☆16 · Aug 20, 2025 · Updated 11 months ago
apple / ml-interactive-data-augmentation
Interactive Data Augmentation (CHI 2025)
☆34 · Mar 20, 2025 · Updated last year
joehoover / cog-poet-vicuna-13b
An instruction tuned large language model with extra support for poetry and verse generation
☆25 · Jun 5, 2023 · Updated 3 years ago
reka-ai / research-eval
A benchmark to evaluate search-augmented LLMs
☆17 · Aug 28, 2025 · Updated 11 months ago
schwartz-lab-NLP / Tokens2Words
☆16 · Apr 2, 2025 · Updated last year
moaraio / SS-self-hosting
This is a public repository to enable researchers to begin their journey of self-hosting data from Semantic Scholar.
☆46 · Nov 7, 2024 · Updated last year
allenai / EmbeddingRecycling
Embedding Recycling for Language models
☆38 · Jul 11, 2023 · Updated 3 years ago
allenai / olmo-cookbook
OLMost every training recipe you need to perform data interventions with the OLMo family of models.
☆72 · Jul 21, 2026 · Updated last week
allenai / FlexOlmo
Code and training scripts for FlexOlmo
☆151 · Apr 20, 2026 · Updated 3 months ago
allenai / peS2o
Pretraining Efficiently on S2ORC!
☆187 · Oct 23, 2024 · Updated last year
harbor-framework / harbor-cookbook
Realistic examples of building evals and optimizing agents with Harbor
☆148 · Apr 23, 2026 · Updated 3 months ago