neelguha/legal-ml-datasets

A collection of datasets and tasks for legal machine learning

☆441

Alternatives and similar repositories for legal-ml-datasets

Users who are interested in legal-ml-datasets are comparing it to the repositories listed below.

HazyResearch / legalbench
An open science effort to benchmark legal reasoning in foundation models
☆614 · Mar 30, 2026 · Updated 3 months ago
Breakend / PileOfLaw
A dataset for pretraining language models targeted for legal tasks.
☆148 · Jun 30, 2022 · Updated 4 years ago
neelguha / legal-segmenter
A simple library for segmenting legal texts
☆18 · Apr 22, 2023 · Updated 3 years ago
maastrichtlawtech / awesome-legal-nlp
📖 A curated list of LegalNLP resources from all around the web.
☆334 · Oct 14, 2025 · Updated 9 months ago
openlegaldata / awesome-legal-data
A collection of datasets and other resources for legal text processing.
☆281 · Jul 9, 2026 · Updated 2 weeks ago
reglab / casehold
Repository for Zheng and Guha et al., 2021, "When Does Pretraining Help? Assessing Self-Supervised Learning for Law and the CaseHOLD Data…
☆97 · Mar 27, 2023 · Updated 3 years ago
coastalcph / lex-glue
LexGLUE: A Benchmark Dataset for Legal Language Understanding in English
☆266 · Jul 23, 2025 · Updated last year
Liquid-Legal-Institute / Legal-LLMs-GPTs
Large Language Models (LLMs) and Generative Pre-trained Transformers (GPTs) for Legal
☆104 · Apr 13, 2023 · Updated 3 years ago
Law-AI / summarization
Implementation of different summarization algorithms applied to legal case judgements.
☆224 · Nov 9, 2022 · Updated 3 years ago
Sreyan88 / DALE
Code for EMNLP 2023 paper: DALE: Generative Data Augmentation for Low-Resource Legal NLP
☆11 · Oct 27, 2023 · Updated 2 years ago
The-Atticus-Project / cuad
CUAD (NeurIPS 2021)
☆542 · Jul 13, 2023 · Updated 3 years ago
lauramanor / legal_summarization
☆40 · Jul 17, 2022 · Updated 4 years ago
Law-AI / semantic-segmentation
Semantic Segmentation of Legal texts that labels sentences with one of 7 rhetorical roles.
☆80 · Jun 19, 2024 · Updated 2 years ago
Jeryi-Sun / LLM-and-Law
This repository is dedicated to summarizing papers related to large language models in the field of law
☆318 · Updated this week
LexPredict / lexpredict-lexnlp
LexNLP by LexPredict
☆791 · May 27, 2024 · Updated 2 years ago
mscarey / legislice
API client for fetching and comparing passages from legislation
☆14 · Jun 29, 2026 · Updated 3 weeks ago
harvard-lil / capstone
CAP database scripts.
☆197 · Sep 10, 2024 · Updated last year
maastrichtlawtech / extraction_libraries
Python libraries for extracting from data sources like Rechtspraak, ECHR, Cellar
☆13 · Jul 2, 2025 · Updated last year
LexPredict / lexpredict-legal-dictionary
LexPredict Legal Dictionaries
☆138 · Aug 31, 2022 · Updated 3 years ago
freelawproject / eyecite
Find legal citations in any block of text
☆263 · Updated this week
Liquid-Legal-Institute / Legal-Ontologies
A list of selected resources, methods, and tools dedicated to legal data schemes and ontologies.
☆185 · Mar 30, 2024 · Updated 2 years ago
sali-legal / LMSS
SALI LMSS: Legal Matter Standard Specification
☆82 · Mar 10, 2026 · Updated 4 months ago
oasis-open / legaldocml-akomantoso
OASIS TC Open Repository: Schema files, examples, exemplificative implementations and libraries, and documentation related to the LegalDo…
☆84 · Jun 2, 2022 · Updated 4 years ago
JoelNiklaus / LawInstruct
This repository is a collection of legal instruction datasets
☆28 · Jul 12, 2024 · Updated 2 years ago
ICLRandD / Blackstone
A spaCy pipeline and model for NLP on unstructured legal text.
☆693 · Jul 16, 2024 · Updated 2 years ago
JustlyAI / lmss_entity_extractor
Tool to apply Legal Matter Specification Standard (LMSS) to documents
☆12 · Aug 15, 2024 · Updated last year
TiltonLAW / LegalWRITER
GPT-3.5-turbo + Harvard's Case Access Project
☆19 · Jun 6, 2023 · Updated 3 years ago
bcgov / lear
Legal Entities and Asset Registry
☆20 · Updated this week
maastrichtlawtech / law3025-legal-analytics
📚 Materials for Legal Analytics (LAW3025) @ Maastricht University
☆14 · Jan 27, 2026 · Updated 5 months ago
iliaschalkidis / LegalCrawler
LegalCrawler: A tool for automated scraping of English legal corpora
☆64 · Aug 18, 2022 · Updated 3 years ago
bockph / Legal-Sentence-Role-Classification
This repo is about the classification of rhetorical roles in Legal Documents such as: Citation, Findings of Fact, Evidence, Legal Rule, R…
☆18 · Feb 22, 2022 · Updated 4 years ago
273v / lmss-suggestion-api
SALI LMSS Suggestion API
☆18 · Jan 5, 2024 · Updated 2 years ago
harvard-lil / olaw
AI + Legal APIs: A Tool-Based Retrieval Augmented Generation Workbench for Legal AI UX Research.
☆164 · Oct 29, 2024 · Updated last year
MohammedAly22 / JudgerAI
Introducing JudgerAI - the revolutionary NLP application that predicts legal judgments with stunning accuracy! Say goodbye to the guesswo…
☆26 · Jun 15, 2025 · Updated last year
unitedstates / uslaw.link
A legal citation resolver.
☆82 · Dec 11, 2022 · Updated 3 years ago
SuffolkLITLab / docassemble-ALWeaver
A tool to help quickly generate draft interviews from an existing document (PDF or DOCX) for the docassemble platform.
☆25 · Jul 7, 2026 · Updated 2 weeks ago
dot-legal / reference
Write beautifully short contracts. https://reference.legal/ is a referenceable clause library to standardize contracts once and for all.
☆13 · Jul 12, 2022 · Updated 4 years ago
harveyai / biglaw-bench
☆171 · Mar 17, 2026 · Updated 4 months ago
TracyWang95 / legal-prompts-for-gpt
Open-source legal prompts
☆383 · Mar 15, 2023 · Updated 3 years ago