TheAtticusProject / maud
☆78 · Updated 2 years ago
Alternatives and similar repositories for maud
Users who are interested in maud are comparing it to the libraries listed below:
- A Python library aimed at dissecting and augmenting NER training data. ☆58 · Updated last year
- 🤗 Disaggregators: Curated data labelers for in-depth analysis. ☆65 · Updated 2 years ago
- SpaCy wrapper for ConceptNet ☆92 · Updated last year
- [EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models. ☆108 · Updated 11 months ago
- Versatile framework designed to streamline the integration of your models, as well as those sourced from Hugging Face, into complex progr… ☆29 · Updated last week
- Datasets collection and preprocessings framework for NLP extreme multitask learning ☆179 · Updated 3 months ago
- Repository for Zheng and Guha et al., 2021, "When Does Pretraining Help? Assessing Self-Supervised Learning for Law and the CaseHOLD Data… ☆87 · Updated 2 years ago
- ☆47 · Updated last year
- Alternate Implementation for Zero Shot Text Classification: Instead of reframing NLI/XNLI, this reframes the text backbone of CLIP models… ☆37 · Updated 3 years ago
- This repository provides scripts for evaluating NLP models on the LEXTREME benchmark, a set of diverse multilingual tasks in legal NLP ☆21 · Updated last year
- Mining Legal Arguments in Court Decisions - Data and software ☆67 · Updated last year
- Generalist and Lightweight Model for Text Classification ☆119 · Updated last week
- ☆13 · Updated 2 years ago
- ☆86 · Updated 2 weeks ago
- Source code and data for Like a Good Nearest Neighbor ☆28 · Updated 3 months ago
- GraphER: A Structure-aware Text-to-Graph Model for Entity and Relation Extraction ☆69 · Updated 8 months ago
- Supervised instruction finetuning for LLM with HF trainer and Deepspeed ☆34 · Updated last year
- A dataset for pretraining language models targeted for legal tasks. ☆130 · Updated 2 years ago
- Legal document similarity - Code, data, and models for the ICAIL 2021 paper "Evaluating Document Representations for Content-based Legal … ☆32 · Updated 3 years ago
- multimodal document analysis ☆164 · Updated 10 months ago
- ☆33 · Updated 2 years ago
- A monolingual and cross-lingual meta-embedding generation and evaluation framework ☆80 · Updated 2 years ago
- [Data + code] ExpertQA : Expert-Curated Questions and Attributed Answers ☆126 · Updated last year
- Seahorse is a dataset for multilingual, multi-faceted summarization evaluation. It consists of 96K summaries with human ratings along 6 q… ☆88 · Updated last year
- Our open source implementation of MiniLMv2 (https://aclanthology.org/2021.findings-acl.188) ☆61 · Updated last year
- Question-answers, collected from Google ☆129 · Updated 3 years ago
- Experiments with generating opensource language model assistants ☆97 · Updated last year
- Reference-Free automatic summarization evaluation with potential hallucination detection ☆100 · Updated last year
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any hugging face text dataset. ☆93 · Updated 2 years ago
- 🤗 HuggingFace Inference Toolkit for Google Cloud Vertex AI (similar to SageMaker's Inference Toolkit, but for Vertex AI and unofficial) ☆17 · Updated last year