Easily experiment with popular language agents across diverse reasoning/decision-making benchmarks!
★54 · Jul 9, 2025 · Updated 10 months ago
Alternatives and similar repositories for agential
Users that are interested in agential are comparing it to the libraries listed below.
- Evals meant to evaluate language models' ability to reason over long contexts. · ★10 · Sep 12, 2024 · Updated last year
- OfficeBench: Benchmarking Language Agents across Multiple Applications for Office Automation · ★36 · Apr 1, 2026 · Updated last month
- Aligning Agentic World Models via Knowledgeable Experience Learning · ★35 · May 15, 2026 · Updated last week
- ★15 · Jun 30, 2025 · Updated 10 months ago
- Small, simple agent task environments for training and evaluation · ★19 · Nov 1, 2024 · Updated last year
- Code for "Natural Language to Code Translation with Execution" · ★41 · Nov 2, 2022 · Updated 3 years ago
- DSPy program/pipeline inspector widget for Jupyter/VSCode Notebooks. · ★45 · Feb 15, 2024 · Updated 2 years ago
- Access Jina AI news via ssh guest@news.jina.ai · ★13 · May 3, 2024 · Updated 2 years ago
- LLM reads a paper and produces a working prototype · ★63 · Apr 12, 2025 · Updated last year
- ★23 · Sep 19, 2024 · Updated last year
- Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments (EMNLP'2024) · ★37 · Dec 29, 2024 · Updated last year
- ★32 · Jul 3, 2025 · Updated 10 months ago
- ★31 · Jan 18, 2025 · Updated last year
- [ICLR 2025] SPORTU: A Comprehensive Sports Understanding Benchmark for Multimodal Large Language Models · ★19 · Sep 17, 2025 · Updated 8 months ago
- Code for Columbia University COMS 3997 - LLM Ethics and Foundations · ★15 · Jan 7, 2025 · Updated last year
- ★109 · Oct 9, 2025 · Updated 7 months ago
- ★15 · Oct 24, 2022 · Updated 3 years ago
- ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. (EMNLP 2024 Demo) · ★93 · May 11, 2026 · Updated 2 weeks ago
- A Docusaurus plugin that generates a concatenated markdown file from your documentation under /llms.txt · ★32 · Nov 15, 2024 · Updated last year
- Source code and data for Things not Written in Text: Exploring Spatial Commonsense from Visual Signals (ACL2022 main conference paper). · ★20 · Oct 10, 2022 · Updated 3 years ago
- AppWorld: A Controllable World of Apps and People for Benchmarking Function Calling and Interactive Coding Agent, ACL'24 Best Resource… · ★421 · Feb 17, 2026 · Updated 3 months ago
- Training GPTs to solve interaction nets · ★18 · Aug 14, 2024 · Updated last year
- Modern markdown blogging platform built with Next.js 14 and Supabase. Features rich content editing with live preview, one-click SEO opti… · ★19 · May 15, 2026 · Updated last week
- Streamlit OpenAI app to chat with custom text documents of all kinds · ★13 · Apr 11, 2026 · Updated last month
- [EMNLP'24 (Main)] DRPO (Dynamic Rewarding with Prompt Optimization) is a tuning-free approach for self-alignment. DRPO leverages a search-… · ★25 · Nov 17, 2024 · Updated last year
- ★15 · Sep 24, 2022 · Updated 3 years ago
- A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments · ★102 · Oct 3, 2025 · Updated 7 months ago
- This is a meta-model distilled from LLMs for information extraction. This is an intermediate checkpoint that can be well-transferred to a… · ★29 · Feb 23, 2025 · Updated last year
- Demonstration-Free: Towards More Practical Log Parsing with Large Language Models · ★29 · Jun 17, 2025 · Updated 11 months ago
- Makes it easy to use altair from FastHTML · ★28 · Oct 9, 2024 · Updated last year
- A toolkit for building computer use AI agents · ★194 · Jun 26, 2025 · Updated 11 months ago
- Logging utilities for spaCy · ★12 · Nov 3, 2023 · Updated 2 years ago
- tiny_fnc_engine is a minimal python library that provides a flexible engine for calling functions extracted from a LLM. · ★38 · Sep 11, 2024 · Updated last year
- Source code and data for The Magic of IF: Investigating Causal Reasoning Abilities in Large Language Models of Code (Findings of ACL 2023… · ★31 · Jun 4, 2023 · Updated 2 years ago
- Hyperbolic Disk Embeddings for Directed Acyclic Graphs (ICML 2019) · ★20 · May 13, 2019 · Updated 7 years ago
- AI powered Chatbot with real time updates. · ★79 · Oct 25, 2024 · Updated last year
- [AST'26] LLAMAFUZZ: Large Language Model Enhanced Greybox Fuzzing · ★23 · Dec 3, 2024 · Updated last year
- The 4th rank system of the SemEval 2021 Task4. · ★10 · May 7, 2022 · Updated 4 years ago
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models · ★120 · Apr 27, 2026 · Updated 3 weeks ago