"A Survey on Agent-as-a-Judge"
☆122May 11, 2026Updated last week
Alternatives and similar repositories for Awesome-Agent-as-a-Judge
Users who are interested in Awesome-Agent-as-a-Judge are comparing it to the libraries listed below.
- ☆13Jan 14, 2026Updated 4 months ago
- (ACL2025 Findings) Official code for the paper "STeCa: Step-level Trajectory Calibration for LLM Agent Learning"☆27Mar 2, 2026Updated 2 months ago
- [ACL 2026 Findings] "Omni-R1: Towards the Unified Generative Paradigm for Multimodal Reasoning"☆62Jan 28, 2026Updated 3 months ago
- ☆21Aug 9, 2024Updated last year
- ☆29Mar 10, 2026Updated 2 months ago
- ☆18Jun 24, 2025Updated 10 months ago
- A Recipe for Building LLM Reasoners to Solve Complex Instructions☆32Oct 9, 2025Updated 7 months ago
- [ICLR 2025] SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration☆67Feb 21, 2025Updated last year
- Code for Incorporating Relevance Feedback for Information-Seeking Retrieval using Few-Shot Document Re-Ranking, EMNLP 2022, https://aclan…☆14Mar 30, 2026Updated last month
- [ICML 2025] Closed-Loop Long-Horizon Robotic Planning via Equilibrium Sequence Modeling☆13May 5, 2025Updated last year
- Official repository for Flash Local Linear Attention☆23Apr 23, 2026Updated 3 weeks ago
- THEORY OF SPACE: a benchmark for evaluating whether foundation models can actively explore under partial observability efficiently to bui…☆76Feb 27, 2026Updated 2 months ago
- Lemon Agent☆53Feb 10, 2026Updated 3 months ago
- Official code for paper "SPA-RL: Reinforcing LLM Agent via Stepwise Progress Attribution"☆84Sep 13, 2025Updated 8 months ago
- Official implementation for paper "How Far Are We from Genuinely Useful Deep Research Agents?"☆65Dec 10, 2025Updated 5 months ago
- Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning☆24Sep 9, 2024Updated last year
- Training tiny models to prove hard theorems☆77Mar 5, 2026Updated 2 months ago
- ☆16Jun 25, 2025Updated 10 months ago
- MLX Implementation of Recursive Reasoning with Tiny Networks☆78Oct 11, 2025Updated 7 months ago
- [ICML2024] Repo for the paper "Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models"☆24Jan 1, 2025Updated last year
- Official Project Page for Web World Models (https://arxiv.org/abs/2512.23676)☆89Jan 30, 2026Updated 3 months ago
- Simple and Ideal Circuit Simulation☆13Dec 4, 2017Updated 8 years ago
- ☆34Apr 11, 2025Updated last year
- ☆10Jan 7, 2020Updated 6 years ago
- Official Implementation for NorMuon paper☆70Apr 30, 2026Updated 2 weeks ago
- ☆14Apr 30, 2025Updated last year
- Radiantloom Email Assist 7B is an email-assistant large language model fine-tuned from Zephyr-7B-Beta, over a custom-curated dataset of 1…☆14Jan 19, 2024Updated 2 years ago
- Data and code for the EMNLP 2022 paper "CDConv: A Benchmark for Contradiction Detection in Chinese Conversations"☆13May 8, 2023Updated 3 years ago
- A project for tri-modal LLM benchmarking and instruction tuning.☆60Mar 27, 2025Updated last year
- The code for paper "EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning"☆38Oct 1, 2025Updated 7 months ago
- Connect VS Code to Google Colab Runtimes☆37Nov 13, 2025Updated 6 months ago
- Training and evaluation scripts for MALA (https://arxiv.org/abs/1709.02974).☆11Jun 23, 2021Updated 4 years ago
- The repo for using the model https://huggingface.co/thu-coai/Attacker-v0.1☆13Apr 23, 2025Updated last year
- ☆11Jul 11, 2023Updated 2 years ago
- [NeurIPS D&B Track 2024] Source code for the paper "Constrained Human-AI Cooperation: An Inclusive Embodied Social Intelligence Challenge…☆25May 2, 2025Updated last year
- ☆11Mar 3, 2026Updated 2 months ago
- Official code for Guiding Language Model Math Reasoning with Planning Tokens☆19Feb 29, 2024Updated 2 years ago
- A PyTorch implementation of a conditional Denoising Diffusion Probabilistic Model (DDPM) for multi-modal trajectory prediction. This proj…☆39Feb 20, 2026Updated 2 months ago
- A buyer's checklist guide for purchasing your new BYD car☆10Feb 18, 2024Updated 2 years ago