The official repo for our paper: LegalAgentBench: Evaluating LLM Agents in Legal Domain
☆44Apr 10, 2026Updated last month
Alternatives and similar repositories for LegalAgentBench
Users that are interested in LegalAgentBench are comparing it to the libraries listed below.
- Repository of paper "Establishing Trustworthy LLM Evaluation via Shortcut Neuron Analysis" (ACL 2025 Main)☆19Jul 19, 2025Updated 10 months ago
- Enhancing Legal Case Retrieval via Scaling High-quality Synthetic Query-Candidate Pairs (EMNLP 2024)☆16Nov 17, 2024Updated last year
- StaRD: Statute Retrieval Dataset based on Real-World Legal Consultation☆22Apr 24, 2025Updated last year
- Code for JuDGE, SIGIR 2025 Long Paper☆34Aug 7, 2025Updated 9 months ago
- LexEval: A Comprehensive Benchmark for Evaluating Large Language Models in Legal Domain☆95Oct 30, 2024Updated last year
- DICE: Detecting In-distribution Data Contamination with LLM's Internal State☆11Sep 21, 2024Updated last year
- A framework for evaluating RAG pipelines, specifically adapted for the legal domain.☆76Jul 28, 2025Updated 9 months ago
- A Survey of Multimodal Retrieval-Augmented Generation☆20Nov 3, 2025Updated 6 months ago
- A general framework used on evaluating the performance of large language models (LLMs) based on the peer review mechanism among LLMs☆19Aug 3, 2024Updated last year
- Benchmarking Retrieval-Augmented Generation in Multi-Turn Legal Consultation Conversation☆44Mar 3, 2025Updated last year
- [COLING 2025] NesTools: A Dataset for Evaluating Nested Tool Learning Abilities of Large Language Models☆18Jan 18, 2025Updated last year
- Repository for the paper: "Using deep learning to predict outcomes of legal appeals better than human experts"☆10Aug 1, 2022Updated 3 years ago
- ☆12Jan 7, 2020Updated 6 years ago
- ☆13Aug 12, 2022Updated 3 years ago
- Code repo for FaStfact: Faster, Stronger Long-Form Factuality Evaluations in LLMs.☆32Nov 5, 2025Updated 6 months ago
- ☆16Dec 17, 2023Updated 2 years ago
- A script to draw attention heat map with matplotlib☆14May 7, 2019Updated 7 years ago
- Test-time compute in information retrieval☆57Jul 8, 2025Updated 10 months ago
- SimKO: Simple Pass@K Policy Optimization☆30Oct 24, 2025Updated 6 months ago
- Group and recognize speakers in an animation video. Extract every speaker from anime.☆13May 8, 2024Updated 2 years ago
- [SIGIR '25] This is the code repo for our SIGIR '25 paper: Enhancing the Patent Matching Capability of Large Language Models via Memory G…☆19Apr 22, 2025Updated last year
- ☆14May 20, 2022Updated 4 years ago
- Explanation of the llama2 repo.☆12Jul 18, 2024Updated last year
- Confidence Regulation Neurons in Language Models (NeurIPS 2024)☆15Feb 1, 2025Updated last year
- Code and Data for "FaithfulRAG: Fact-Level Conflict Modeling for Context-Faithful Retrieval-Augmented Generation" (ACL25)☆33Oct 26, 2025Updated 6 months ago
- This repository is a collection of legal instruction datasets☆27Jul 12, 2024Updated last year
- ☆12Jan 21, 2024Updated 2 years ago
- IMPACT: A Large-scale Integrated Multimodal Patent Analysis and Creation Dataset for Design Patents (NeurIPS 2024)☆18Jul 14, 2025Updated 10 months ago
- [ICLR 2025] "GraphEval: A Lightweight Graph-Based LLM Framework for Idea Evaluation", Tao Feng, Yihang Sun, Jiaxuan You☆18Mar 18, 2025Updated last year
- 🤖 A multilingual translation tool that automatically converts Hugging Face's daily AI research papers into 🇯🇵 Japanese, 🇰🇷 Korean, …☆18Updated this week
- ☆12Jul 21, 2025Updated 10 months ago
- Reasoning-based Evaluation and Ranking of Translations.☆20Jul 18, 2025Updated 10 months ago
- Experimental tl;dr summaries for datasets on the Hugging Face Hub!☆10Apr 4, 2024Updated 2 years ago
- (ACL 2025) 🔥🔥🔥Code for "Empowering Multimodal Large Language Models with Evol-Instruct"☆22May 15, 2025Updated last year
- Data and code for <Precedent-Enhanced Legal Judgment Prediction with LLM and Domain-Model Collaboration>, will be updated soon.☆15Mar 21, 2024Updated 2 years ago
- Reinforcement learning course, mainly on how to solve problems with reinforcement learning☆15Dec 10, 2024Updated last year
- ☆12Jul 4, 2025Updated 10 months ago
- official implementation of ICLR'2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and…☆72Apr 2, 2025Updated last year
- [EuroSys'25] Mist: Efficient Distributed Training of Large Language Models via Memory-Parallelism Co-Optimization☆23Apr 13, 2026Updated last month