A collection of datasets and tasks for legal machine learning
☆429Jan 4, 2026Updated 2 months ago
Alternatives and similar repositories for legal-ml-datasets
Users who are interested in legal-ml-datasets are comparing it to the libraries listed below
- An open science effort to benchmark legal reasoning in foundation models☆546Aug 25, 2024Updated last year
- A simple library for segmenting legal texts☆17Apr 22, 2023Updated 2 years ago
- A dataset for pretraining language models targeted for legal tasks.☆143Jun 30, 2022Updated 3 years ago
- A collection of datasets and other resources for legal text processing.☆190Oct 20, 2025Updated 4 months ago
- 📖 A curated list of LegalNLP resources from all around the web.☆304Oct 14, 2025Updated 4 months ago
- NLP Web API for Legal Text☆18Dec 23, 2022Updated 3 years ago
- Repository for Zheng and Guha et al., 2021, "When Does Pretraining Help? Assessing Self-Supervised Learning for Law and the CaseHOLD Data…☆95Mar 27, 2023Updated 2 years ago
- A list of selected resources, methods, and tools dedicated to Legal Text Analytics.☆699Nov 5, 2024Updated last year
- API client for fetching and comparing passages from legislation☆14Jan 26, 2025Updated last year
- LexGLUE: A Benchmark Dataset for Legal Language Understanding in English☆241Jul 23, 2025Updated 7 months ago
- Implementation of different summarization algorithms applied to legal case judgements.☆218Nov 9, 2022Updated 3 years ago
- Large Language Models (LLMs) and Generative Pre-trained Transformers (GPTs) for Legal☆100Apr 13, 2023Updated 2 years ago
- SALI LMSS: Legal Matter Standard Specification☆75Mar 24, 2025Updated 11 months ago
- Find legal citations in any block of text☆211Oct 3, 2025Updated 5 months ago
- LegalCrawler: A tool for automated scraping of English legal corpora☆61Aug 18, 2022Updated 3 years ago
- CUAD (NeurIPS 2021)☆478Jul 13, 2023Updated 2 years ago
- Tool to apply Legal Matter Specification Standard (LMSS) to documents☆12Aug 15, 2024Updated last year
- Code for EMNLP 2023 paper: DALE: Generative Data Augmentation for Low-Resource Legal NLP☆10Oct 27, 2023Updated 2 years ago
- This repository is dedicated to summarizing papers related to large language models in the field of law☆285Jan 15, 2026Updated last month
- Semantic Segmentation of Legal texts that labels sentences with one of 7 rhetorical roles.☆78Jun 19, 2024Updated last year
- LexNLP by LexPredict☆767May 27, 2024Updated last year
- KL3M training data collection and preprocessing☆20Apr 14, 2025Updated 10 months ago
- LexPredict Legal Dictionaries☆132Aug 31, 2022Updated 3 years ago
- ☆114Oct 8, 2025Updated 5 months ago
- Instant redline with AI summary☆38Dec 7, 2025Updated 3 months ago
- A spaCy pipeline and model for NLP on unstructured legal text.☆676Jul 16, 2024Updated last year
- SALI LMSS Suggestion API☆18Jan 5, 2024Updated 2 years ago
- This repo is about the classification of rhetorical roles in Legal Documents such as: Citation, Findings of Fact, Evidence, Legal Rule, R…☆16Feb 22, 2022Updated 4 years ago
- A list of selected resources, methods, and tools dedicated to legal data schemes and ontologies.☆152Mar 30, 2024Updated last year
- CAP database scripts.☆195Sep 10, 2024Updated last year
- A database of court reporters, tests and other experiments☆124Feb 9, 2026Updated last month
- GPT-3.5-turbo + Harvard's Case Access Project☆18Jun 6, 2023Updated 2 years ago
- Python libraries for extracting from data sources like Rechtspraak, ECHR, Cellar☆13Jul 2, 2025Updated 8 months ago
- 📚 Materials for Legal Analytics (LAW3025) @ Maastricht University☆13Jan 27, 2026Updated last month
- Trained BERT and Word2Vec legal clause classifiers for spaCy using the Atticus Project's Open Source Contract Label Corpus☆13Jan 2, 2021Updated 5 years ago
- LeXFiles and LegalLAMA: Facilitating English Multinational Legal Language Model Development☆21Jul 24, 2023Updated 2 years ago
- Quickly go from a paper court form to a runnable, guided, step-by-step web application powered by Docassemble. Swap out branding and pre-…☆55Updated this week
- A fully-searchable and accessible archive of court data including growing repositories of opinions, oral arguments, judges, judicial fina…☆860Mar 4, 2026Updated last week
- Data and additional information regarding the paper: Contract Discovery. Dataset and a Few-Shot Semantic Retrieval Challenge with Competi…☆32Nov 12, 2020Updated 5 years ago