The official repo for our paper: LegalAgentBench: Evaluating LLM Agents in Legal Domain
☆46 · Apr 10, 2026 · Updated 2 months ago
Alternatives and similar repositories for LegalAgentBench
Users that are interested in LegalAgentBench are comparing it to the libraries listed below.
- Enhancing Legal Case Retrieval via Scaling High-quality Synthetic Query-Candidate Pairs (EMNLP 2024) ☆16 · Nov 17, 2024 · Updated last year
- StaRD: Statute Retrieval Dataset based on Real-World Legal Consultation ☆22 · Apr 24, 2025 · Updated last year
- Official repository of the video reasoning benchmark MMR-V. Can Your MLLMs "Think with Video"? [ICLR26] ☆40 · Jun 23, 2025 · Updated 11 months ago
- Code for JuDGE, SIGIR 2025 Long Paper ☆35 · Aug 7, 2025 · Updated 10 months ago
- LexEval: A Comprehensive Benchmark for Evaluating Large Language Models in Legal Domain ☆98 · Oct 30, 2024 · Updated last year
- DICE: Detecting In-distribution Data Contamination with LLM's Internal State ☆12 · Sep 21, 2024 · Updated last year
- A framework for evaluating RAG pipelines, specifically adapted for the legal domain. ☆77 · Jul 28, 2025 · Updated 10 months ago
- A Survey of Multimodal Retrieval-Augmented Generation ☆20 · Nov 3, 2025 · Updated 7 months ago
- A general framework used on evaluating the performance of large language models (LLMs) based on the peer review mechanism among LLMs ☆19 · Aug 3, 2024 · Updated last year
- ☆18 · Jun 3, 2024 · Updated 2 years ago
- Benchmarking Retrieval-Augmented Generation in Multi-Turn Legal Consultation Conversation ☆46 · Mar 3, 2025 · Updated last year
- CS294/194-196 Large Language Model Agents ☆48 · Dec 20, 2024 · Updated last year
- [NeurIPS 2024 poster] Cross-model Control: Improving Multiple Large Language Models in One-time Training ☆14 · Oct 25, 2024 · Updated last year
- [COLING 2025] NesTools: A Dataset for Evaluating Nested Tool Learning Abilities of Large Language Models ☆18 · Jan 18, 2025 · Updated last year
- Repository for the paper: "Using deep learning to predict outcomes of legal appeals better than human experts" ☆11 · Aug 1, 2022 · Updated 3 years ago
- ☆14 · May 9, 2024 · Updated 2 years ago
- Code for the paper "A Comprehensive Evaluation of Large Language Models on Legal Judgment Prediction" ☆12 · Oct 20, 2023 · Updated 2 years ago
- Official code space for "SWE-Dev: Evaluating and Training Autonomous Feature-Driven Software Development" ☆60 · Oct 24, 2025 · Updated 7 months ago
- AutoCoA (Automatic generation of Chain-of-Action) is an agent model framework that enhances the multi-turn tool usage capability of reaso… ☆132 · Mar 18, 2025 · Updated last year
- Code repo for FaStfact: Faster, Stronger Long-Form Factuality Evaluations in LLMs. ☆31 · Nov 5, 2025 · Updated 7 months ago
- ☆16 · Dec 17, 2023 · Updated 2 years ago
- SimKO: Simple Pass@K Policy Optimization ☆30 · Oct 24, 2025 · Updated 7 months ago
- Test-time compute in information retrieval ☆58 · Jul 8, 2025 · Updated 11 months ago
- Code for Modeling Dynamic Pairwise Attention for Crime Classification over Legal Articles, SIGIR 2018 ☆12 · Jan 4, 2019 · Updated 7 years ago
- Automatically Update LLM-Agent Papers Daily using Github Actions (Update Every 12th hours) ☆20 · Jun 1, 2026 · Updated last week
- ☆13 · Sep 26, 2024 · Updated last year
- Group and recognize speakers in an animation video (extract every speaker from anime). ☆13 · May 8, 2024 · Updated 2 years ago
- [SIGIR '25] This is the code repo for our SIGIR '25 paper: Enhancing the Patent Matching Capability of Large Language Models via Memory G… ☆19 · Apr 22, 2025 · Updated last year
- ☆14 · May 20, 2022 · Updated 4 years ago
- Explanation of the llama2 repo. ☆12 · Jul 18, 2024 · Updated last year
- Code for ICCV2025 paper "IDEATOR: Jailbreaking and Benchmarking Large Vision-Language Models Using Themselves" ☆17 · Jul 11, 2025 · Updated 11 months ago
- Official codebase for NeurIPS 2022 paper End-to-end Learning to Index and Search in Large Output Spaces ☆12 · Apr 19, 2023 · Updated 3 years ago
- ☆13 · Jan 21, 2024 · Updated 2 years ago
- Universal LLM security auditor with automated jailbreak testing, DSPy optimization, and OWASP 2025-aligned attack patterns ☆21 · Oct 23, 2025 · Updated 7 months ago
- IMPACT: A Large-scale Integrated Multimodal Patent Analysis and Creation Dataset for Design Patents (NeurIPS 2024) ☆18 · Jul 14, 2025 · Updated 10 months ago
- This repository is a collection of legal instruction datasets ☆28 · Jul 12, 2024 · Updated last year
- [ICLR 2025] "GraphEval: A Lightweight Graph-Based LLM Framework for Idea Evaluation", Tao Feng, Yihang Sun, Jiaxuan You ☆18 · Mar 18, 2025 · Updated last year
- ☆146 · May 26, 2026 · Updated 2 weeks ago
- ☆28 · Oct 14, 2024 · Updated last year