CSHaitao/LegalAgentBench

The official repo for our paper: LegalAgentBench: Evaluating LLM Agents in Legal Domain

☆49

Alternatives and similar repositories for LegalAgentBench

Users who are interested in LegalAgentBench are comparing it to the repositories listed below.

oneal2000 / JuDGE
Code for JuDGE, SIGIR 2025 Long Paper
☆35 · Aug 7, 2025 · Updated 11 months ago
CSHaitao / LexEval
LexEval: A Comprehensive Benchmark for Evaluating Large Language Models in Legal Domain
☆102 · Oct 30, 2024 · Updated last year
THUlawtech / JUREX
JUREX-4E: Juridical Expert-Annotated Four-Element Knowledge Base for Legal Reasoning
☆15 · Oct 27, 2025 · Updated 8 months ago
CSHaitao / LexRAG
Benchmarking Retrieval-Augmented Generation in Multi-Turn Legal Consultation Conversation
☆49 · Mar 3, 2025 · Updated last year
MetaGLM / qingyan-cookbook
Examples for QingYan GLMs
☆13 · Sep 3, 2024 · Updated last year
relic-yuexi / AgentCourt
☆96 · Sep 5, 2024 · Updated last year
hoorangyee / LRAGE
A framework for evaluating RAG pipelines, specifically adapted for the legal domain.
☆77 · Jul 3, 2026 · Updated 2 weeks ago
CSHaitao / CaseGen
A Benchmark for Multi-Stage Legal Case Documents Generation
☆22 · Feb 24, 2025 · Updated last year
THU-KEG / DICE
DICE: Detecting In-distribution Data Contamination with LLM's Internal State
☆12 · Sep 21, 2024 · Updated last year
Tizzzzy / Law_LLM
☆31 · Oct 19, 2024 · Updated last year
PanguIR / MRAGSurvey
A Survey of Multimodal Retrieval-Augmented Generation
☆20 · Nov 3, 2025 · Updated 8 months ago
chuzhumin98 / PRE
A general framework for evaluating the performance of large language models (LLMs) based on a peer-review mechanism among LLMs
☆19 · Aug 3, 2024 · Updated last year
cjj826 / GoalAct
The repo for our paper: Enhancing LLM-Based Agents via Global Planning and Hierarchical Execution (NCIIP 2025 Best Paper)
☆17 · Aug 18, 2025 · Updated 11 months ago
bpwu1 / confidence-regulation-neurons
Confidence Regulation Neurons in Language Models (NeurIPS 2024)
☆15 · Feb 1, 2025 · Updated last year
francescortu / comp-mech
Competition of Mechanisms: Tracing How Language Models Handle Facts and Counterfactuals; ACL 2024
☆13 · May 24, 2024 · Updated 2 years ago
XMUDeepLIT / Faithful-RAG
Code and Data for "FaithfulRAG: Fact-Level Conflict Modeling for Context-Faithful Retrieval-Augmented Generation" (ACL25)
☆39 · Oct 26, 2025 · Updated 8 months ago
CLR-Lab / SimKO
SimKO: Simple Pass@K Policy Optimization
☆31 · Oct 24, 2025 · Updated 8 months ago
zRzRzRzRzRzRzR / lm-fly
Accelerating large-model inference frameworks to make LLMs fly
☆24 · May 10, 2024 · Updated 2 years ago
wujwyi / CMC
[NeurIPS 2024 poster] Cross-model Control: Improving Multiple Large Language Models in One-time Training
☆14 · Oct 25, 2024 · Updated last year
THU-KEG / DeepPrune
🌿 DeepPrune: Parallel Scaling without Inter-trace Redundancy
☆21 · Apr 20, 2026 · Updated 3 months ago
2404589803 / hf_downloader
🤗 HF Downloader (Hugging Face Downloader) 📦 A user-friendly GUI tool for downloading Hugging Face resources with enhanced connectivity…
☆13 · Jan 5, 2025 · Updated last year
cxcscmu / deepresearch_benchmarking
☆29 · Mar 10, 2026 · Updated 4 months ago
huiwy / reflection-on-trees
☆14 · May 9, 2024 · Updated 2 years ago
eliasjacob / paper_brcad5
Repository for the paper: "Using deep learning to predict outcomes of legal appeals better than human experts"
☆11 · Aug 1, 2022 · Updated 3 years ago
Huenao / Debate-Augmented-RAG
[ACL 2025] Removal of Hallucination on Hallucination: Debate-Augmented RAG
☆44 · Aug 4, 2025 · Updated 11 months ago
LaVi-Lab / LongContextReasoner
[ACL 2024] Making Long-Context Language Models Better Multi-Hop Reasoners
☆20 · May 28, 2024 · Updated 2 years ago
saintzema / legal-ai-agent
An AI solution that interprets legal documents such as Contract Review, Legal Research, Risk Assessment, Compliance Check built with pyth…
☆15 · Dec 23, 2024 · Updated last year
HKAIR-Lab / HK-O1aw
☆43 · Nov 1, 2024 · Updated last year
gouki510 / Topology_of_Reasoning
☆42 · Jun 11, 2025 · Updated last year
nishiwen1214 / Benchmark-leakage-detection
Official completion of “Training on the Benchmark Is Not All You Need”.
☆40 · Dec 31, 2024 · Updated last year
happywwy / RuleFusionForIE
☆12 · Jan 7, 2020 · Updated 6 years ago
huawei-lin / RapidIn
RapidIn: Scalable Influence Estimation for Large Language Models (LLMs). The implementation for paper "Token-wise Influential Training Da…
☆22 · Mar 10, 2026 · Updated 4 months ago
abehou / CLERC
Code repo for CLERC: A Legal Precedent Dataset for Case Retrieval and Retrieval-Augmented Analysis Generation (NAACL 2025)
☆28 · Jan 28, 2025 · Updated last year
Zhang-Yihao / Adversarial-Representation-Engineering
Official implementation repository for the paper Towards General Conceptual Model Editing via Adversarial Representation Engineering.
☆20 · Dec 6, 2024 · Updated last year
hhan1018 / NesTools
[COLING 2025] NesTools: A Dataset for Evaluating Nested Tool Learning Abilities of Large Language Models
☆18 · Jan 18, 2025 · Updated last year
Starsshine21 / RL100
☆21 · Jun 22, 2026 · Updated last month
Strong-AI-Lab / ChatLogic
☆16 · Dec 17, 2023 · Updated 2 years ago
razvan404 / multimodal-speech-emotion-recognition
Multimodal SER Model meant to be trained on recognising emotions from speech (text + acoustic data). Fine-tuned the DeBERTaV3 model, resp…
☆11 · Jun 19, 2024 · Updated 2 years ago
ADaM-BJTU / AutoCoA
AutoCoA (Automatic generation of Chain-of-Action) is an agent model framework that enhances the multi-turn tool usage capability of reaso…
☆132 · Mar 18, 2025 · Updated last year