metauto-ai / agent-as-a-judge
Agent-as-a-Judge and DevAI dataset
☆184 · Updated last week
Related projects
Alternatives and complementary repositories for agent-as-a-judge
- Environments, tools, and benchmarks for general computer agents ☆172 · Updated 2 weeks ago
- Building Open LLM Web Agents with Self-Evolving Online Curriculum RL ☆166 · Updated this week
- FireAct: Toward Language Agent Fine-tuning ☆254 · Updated last year
- AWM: Agent Workflow Memory ☆203 · Updated last month
- ☆283 · Updated last month
- ☆116 · Updated 5 months ago
- Official repo for "LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs". ☆189 · Updated 2 months ago
- Expert Specialized Fine-Tuning ☆143 · Updated last month
- ☆280 · Updated 7 months ago
- ☆192 · Updated 6 months ago
- Official Repo for "Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale" ☆190 · Updated 3 weeks ago
- Code for Husky, an open-source language agent that solves complex, multi-step reasoning tasks. Husky v1 addresses numerical, tabular and … ☆328 · Updated 4 months ago
- ☆102 · Updated 2 months ago
- This is the official repo for "PromptAgent: Strategic Planning with Language Models Enables Expert-level Prompt Optimization". PromptAgen… ☆199 · Updated 3 months ago
- An Analytical Evaluation Board of Multi-turn LLM Agents ☆245 · Updated 5 months ago
- KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents ☆171 · Updated 3 weeks ago
- Code and implementations for the paper "AgentGym: Evolving Large Language Model-based Agents across Diverse Environments" by Zhiheng Xi e… ☆346 · Updated 2 months ago
- An implementation of Everything of Thoughts (XoT). ☆129 · Updated 8 months ago
- LongEmbed: Extending Embedding Models for Long Context Retrieval (EMNLP 2024) ☆114 · Updated this week
- OS-ATLAS: A Foundation Action Model For Generalist GUI Agents ☆133 · Updated this week
- This repository contains the paper list for the paper: Igniting Language Intelligence: The Hitchhiker's Guide From Chain-of-Thought Reaso… ☆339 · Updated 11 months ago
- [ACL 2024] LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement ☆153 · Updated 7 months ago
- ☆152 · Updated 2 months ago
- Official implementation of "DS-Agent: Automated Data Science by Empowering Large Language Models with Case-Based Reasoning" in ICML'24 ☆126 · Updated this week
- Reformatted Alignment ☆112 · Updated last month
- Official repo for "Make Your LLM Fully Utilize the Context" ☆241 · Updated 5 months ago
- ☆128 · Updated last week
- ControlLLM: Augment Language Models with Tools by Searching on Graphs ☆186 · Updated 3 months ago
- ☆223 · Updated this week
- ☆190 · Updated 2 months ago