"A Survey on Agent-as-a-Judge"
☆127 · May 11, 2026 · Updated 3 weeks ago
Alternatives and similar repositories for Awesome-Agent-as-a-Judge
Users that are interested in Awesome-Agent-as-a-Judge are comparing it to the libraries listed below.
- ☆13 · Jan 14, 2026 · Updated 4 months ago
- ☆21 · Aug 9, 2024 · Updated last year
- ☆29 · Mar 10, 2026 · Updated 2 months ago
- ☆18 · Jun 24, 2025 · Updated 11 months ago
- The project tries to solve a speaker diarization problem using audio features, face recognition and video feature extraction from face im… ☆15 · Feb 10, 2019 · Updated 7 years ago
- Code for Incorporating Relevance Feedback for Information-Seeking Retrieval using Few-Shot Document Re-Ranking, EMNLP 2022, https://aclan… ☆14 · Mar 30, 2026 · Updated 2 months ago
- [ICML 2025] Closed-Loop Long-Horizon Robotic Planning via Equilibrium Sequence Modeling ☆13 · May 5, 2025 · Updated last year
- Introductory Python course for computational linguistics ☆26 · Aug 20, 2024 · Updated last year
- Official repository for Flash Local Linear Attention ☆36 · May 28, 2026 · Updated last week
- THEORY OF SPACE: a benchmark for evaluating whether foundation models can actively explore under partial observability efficiently to bui… ☆80 · Feb 27, 2026 · Updated 3 months ago
- Official code for paper "SPA-RL: Reinforcing LLM Agent via Stepwise Progress Attribution" ☆86 · Sep 13, 2025 · Updated 8 months ago
- Evaluation code of ASE24 accepted paper "On the Evaluation of LLM in Unit Test Generation" ☆13 · Dec 9, 2024 · Updated last year
- [ICML 2024] Repo for the paper "Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models" ☆24 · Jan 1, 2025 · Updated last year
- ☆13 · Sep 12, 2024 · Updated last year
- Simple and Ideal Circuit Simulation ☆13 · Dec 4, 2017 · Updated 8 years ago
- ☆13 · May 21, 2024 · Updated 2 years ago
- ☆14 · Apr 30, 2025 · Updated last year
- ☆20 · Oct 13, 2020 · Updated 5 years ago
- The code for paper "EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning" ☆39 · Oct 1, 2025 · Updated 8 months ago
- ☆13 · Nov 20, 2024 · Updated last year
- Connect VS Code to Google Colab Runtimes ☆37 · Nov 13, 2025 · Updated 6 months ago
- LEMMA: Logical Engine for Multi-domain Mathematical Analysis ☆28 · Feb 14, 2026 · Updated 3 months ago
- [NeurIPS D&B Track 2024] Source code for the paper "Constrained Human-AI Cooperation: An Inclusive Embodied Social Intelligence Challenge… ☆25 · May 2, 2025 · Updated last year
- ☆11 · Mar 3, 2026 · Updated 3 months ago
- Official code for Guiding Language Model Math Reasoning with Planning Tokens ☆19 · Feb 29, 2024 · Updated 2 years ago
- A PyTorch implementation of a conditional Denoising Diffusion Probabilistic Model (DDPM) for multi-modal trajectory prediction. This proj… ☆39 · Feb 20, 2026 · Updated 3 months ago
- ☆14 · Mar 26, 2024 · Updated 2 years ago
- Reflect-RL: Two-Player Online RL Fine-Tuning for LMs ☆18 · Jul 19, 2025 · Updated 10 months ago
- ☆11 · Jun 21, 2025 · Updated 11 months ago
- ☆14 · Nov 19, 2024 · Updated last year
- No More Manual Tests? Evaluating and Improving ChatGPT for Unit Test Generation ☆19 · Jun 28, 2023 · Updated 2 years ago
- Official Repo for "EcoGym: Evaluating LLMs for Long-Horizon Plan-and-Execute in Interactive Economies" ☆95 · Mar 18, 2026 · Updated 2 months ago
- [NeurIPS 2025] Reasoning Models Better Express Their Confidence ☆23 · Nov 19, 2025 · Updated 6 months ago
- REAP expert pruning for MoE LLMs on Apple Silicon via MLX ☆57 · Mar 16, 2026 · Updated 2 months ago
- [ACL 2026 Oral] From Word to World: Can Large Language Models be Implicit Text-based World Models? ☆62 · Apr 13, 2026 · Updated last month
- This project is the official implementation of our ACM MM 2024 paper, OmniStitch: Depth-aware Stitching Framework for Omnidirectional Vis… ☆18 · Aug 5, 2024 · Updated last year
- Project for a Computer Security class based on CSAW capture the flag challenges ☆13 · Mar 19, 2014 · Updated 12 years ago
- A lightweight graphics library for the Elm programming language ☆15 · Jul 15, 2017 · Updated 8 years ago
- ☆41 · Feb 14, 2026 · Updated 3 months ago