JoelNiklaus / LegalDatasets
This repository serves as a collection of scrapers procuring and structuring various legal datasets
☆17 Updated 2 years ago
Alternatives and similar repositories for LegalDatasets
Users that are interested in LegalDatasets are comparing it to the libraries listed below
- Writing Blog Posts with Generative Feedback Loops! ☆49 Updated last year
- Explore the use of DSPy for extracting features from PDFs ☆43 Updated last year
- Docutron Toolkit: detection and segmentation analysis for legal data extraction over documents. ☆26 Updated last year
- Streamlit app for recommending eval functions using prompt diffs ☆28 Updated last year
- Large Language Models (LLMs) and Generative Pre-trained Transformers (GPTs) for Legal ☆92 Updated 2 years ago
- Dataset Viber is your chill repo for data collection, annotation and vibe checks. ☆47 Updated 10 months ago
- A dataset for pretraining language models targeted for legal tasks. ☆134 Updated 3 years ago
- A Python pipeline tool and plugin ecosystem for processing technical documents. Process papers from arXiv, SemanticScholar, PDF, with GRO… ☆51 Updated 3 months ago
- Unstructured Data Connectors for Haystack 2.0 ☆17 Updated last year
- Repository of the code base for KT Generation process that we worked at Google Cloud and Searce GenAI Hackathon. ☆74 Updated last year
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆49 Updated last year
- Text to Python Objects via a LLM Function Call ☆58 Updated last year
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ☆66 Updated 8 months ago
- A multimodal RAG application that enables semantic search on multimedia sources like audio, video and images ☆40 Updated last year
- Source codes for the paper "Bounding the Capabilities of Large Language Models in Open Text Generation with Prompt Constraints" ☆27 Updated 2 years ago
- Codebase accompanying the Summary of a Haystack paper. ☆79 Updated 9 months ago
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024) ☆75 Updated 8 months ago
- Fullstack chatbot application ☆11 Updated last week
- Tool to apply Legal Matter Specification Standard (LMSS) to documents ☆13 Updated 11 months ago
- Mixing Language Models with Self-Verification and Meta-Verification ☆106 Updated 7 months ago
- Code for evaluating with Flow-Judge-v0.1 - an open-source, lightweight (3.8B) language model optimized for LLM system evaluations. Crafte… ☆73 Updated 8 months ago
- Trained BERT and Word2Vec legal clause classifiers for SPACY using the Atticus Project's Open Source Contract Label Corpus ☆14 Updated 4 years ago
- Code and Dataset for Learning to Solve Complex Tasks by Talking to Agents ☆24 Updated 3 years ago
- AI_Powered_Dev_Search_Engine ☆12 Updated last year
- [SIGIR 2024 (Demo)] CoSearchAgent: A Lightweight Collaborative Search Agent with Large Language Models ☆27 Updated last year
- ☆75 Updated last year
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models ☆108 Updated 3 months ago
- Lightweight Non-Parametric Embedding Fine-Tuning ☆25 Updated 9 months ago
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆64 Updated last year
- A tutorial on DSPy and whether automated prompt engineering lives up to the hype ☆23 Updated last year