withmartian/ares

Agentic Research and Evaluation Suite

☆107

Alternatives and similar repositories for ares

Users who are interested in ares are comparing it to the repositories listed below.

harbor-framework / terminal-bench-challenges
☆19 · Jun 18, 2026 · Updated last month
kanishkg / endless-terminals
☆134 · Mar 31, 2026 · Updated 3 months ago
harbor-framework / awesome-harbor
A curated list of awesome Harbor ecosystem projects
☆48 · May 29, 2026 · Updated last month
modal-labs / multinode-training-guide
Well documented examples of running distributed training jobs on Modal
☆29 · Jul 19, 2026 · Updated last week
josancamon19 / trace
Trajectory Recording and Capture Environments
☆19 · Jan 24, 2026 · Updated 6 months ago
harbor-framework / harbor
Framework for evaluating and improving agents
☆3,504 · Updated this week
samaya-ai / frontier-finance
Samaya AI's FrontierFinance Benchmark Grader
☆17 · Jul 16, 2026 · Updated last week
bgub / tokka-bench
benchmarks for LLM tokenizers
☆20 · Mar 25, 2026 · Updated 4 months ago
Andrewzh112 / AI-Research-Interview-Lab
☆31 · Nov 14, 2025 · Updated 8 months ago
collaborative-agents / coco
Coco is a proactive co-assistant that connects user workspace with a broader ecosystem of AI agents.
☆23 · Updated this week
frt03 / mxt_bench
A System for Morphology-Task Generalization via Unified Representation and Behavior Distillation (ICLR2023)
☆14 · Feb 3, 2023 · Updated 3 years ago
abrvkh / explainability_toolkit
☆14 · Dec 12, 2024 · Updated last year
Mercor-Intelligence / apex-evals
☆15 · Jun 19, 2026 · Updated last month
open-thoughts / OpenThoughts-Agent
Data recipes and robust infrastructure for training AI agents
☆265 · Updated this week
radixark / miles
Miles is an enterprise-facing reinforcement learning framework for LLM and VLM post-training, forked from and co-evolving with slime.
☆1,789 · Updated this week
argonne-lcf / Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
☆17 · Mar 11, 2026 · Updated 4 months ago
yutori-ai / navi-bench
Navi-Bench: benchmarking web agents on everyday tasks directly on real websites
☆19 · Updated this week
PrimeIntellect-ai / verifiers
Our library for RL environments + evals
☆4,400 · Updated this week
NousResearch / atropos
Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse …
☆1,340 · Jul 4, 2026 · Updated 3 weeks ago
PrimeIntellect-ai / prime-rl
Agentic RL Training at Scale
☆1,724 · Updated this week
abundant-ai / SWE-gen
Convert GitHub PRs into Harbor tasks
☆72 · Jul 13, 2026 · Updated last week
JoshuaPurtell / SmallBench
Small, simple agent task environments for training and evaluation
☆20 · Nov 1, 2024 · Updated last year
akutuzov / semeval2020
Lexical semantic change detection shared task at SemEval 2020: UiO-UVA team
☆16 · Jan 10, 2023 · Updated 3 years ago
EleutherAI / deep-ignorance
☆20 · Jan 7, 2026 · Updated 6 months ago
open-tinker / OpenTinker
OpenTinker is an RL-as-a-Service infrastructure for foundation models
☆676 · Mar 21, 2026 · Updated 4 months ago
frt03 / jax_dt
Minimal Decision Transformer Implementation written in Jax (Flax).
☆18 · Aug 8, 2022 · Updated 3 years ago
modaic-ai / microcode
context-efficient terminal agent powered by an RLM
☆60 · Feb 7, 2026 · Updated 5 months ago
aisa-group / PostTrainBench
Measuring how well CLI agents like Claude Code or Codex CLI can post-train base LLMs on a single H100 GPU in 10 hours
☆467 · Updated this week
tilde-research / sieve
Applying SAEs for fine-grained control
☆27 · Dec 15, 2024 · Updated last year
sneha-rk / data-recipes
☆38 · May 4, 2026 · Updated 2 months ago
withmartian / llm-adapters
Package for calling different models with same interface
☆34 · Jul 21, 2025 · Updated last year
BlinkDL / Agen
Agen is a minimalist language for agent loops and state machines.
☆50 · Mar 30, 2026 · Updated 3 months ago
G4brym / aletria
Simple AI CLI that generates docs, unit tests and README.md files
☆15 · Mar 8, 2026 · Updated 4 months ago
Infini-AI-Lab / Sparrow
☆16 · Jun 15, 2026 · Updated last month
pgasawa / continual-learning-bench
Continual Learning Bench
☆189 · Jul 19, 2026 · Updated last week
PrimeIntellect-ai / smart-contracts
Solidity contracts for the decentralized Prime Network protocol
☆26 · Jul 6, 2025 · Updated last year
benchjack / benchjack
AI agent benchmark hackability scanner — find evaluation vulnerabilities before they undermine your results
☆40 · May 25, 2026 · Updated 2 months ago
ianarawjo / evalstats
Statistical analysis methods for comparing prompt and model performance in LLM evaluations.
☆109 · Updated this week
arcee-ai / pybubble
☆81 · Feb 18, 2026 · Updated 5 months ago