TheAtticusProject / maud
☆78 · Updated 2 years ago
Alternatives and similar repositories for maud
Users who are interested in maud compare it to the repositories listed below:
- A Python library aimed at dissecting and augmenting NER training data. ☆58 · Updated 2 years ago
- Disaggregators: Curated data labelers for in-depth analysis. ☆66 · Updated 2 years ago
- [EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models. ☆108 · Updated 11 months ago
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any hugging face text dataset. ☆93 · Updated 2 years ago
- Datasets collection and preprocessings framework for NLP extreme multitask learning ☆180 · Updated 4 months ago
- ☆86 · Updated last month
- ☆47 · Updated last year
- SpaCy wrapper for ConceptNet ☆93 · Updated last year
- Versatile framework designed to streamline the integration of your models, as well as those sourced from Hugging Face, into complex progr… ☆30 · Updated last month
- Mining Legal Arguments in Court Decisions - Data and software ☆68 · Updated last year
- Code for Relevance-guided Supervision for OpenQA with ColBERT (TACL'21) ☆41 · Updated 3 years ago
- Open source library for few shot NLP ☆78 · Updated last year
- Embedding Recycling for Language models ☆38 · Updated last year
- Source code and data for Like a Good Nearest Neighbor ☆28 · Updated 4 months ago
- Legal document similarity - Code, data, and models for the ICAIL 2021 paper "Evaluating Document Representations for Content-based Legal … ☆32 · Updated 4 years ago
- No Parameter Left Behind: How Distillation and Model Size Affect Zero-Shot Retrieval ☆29 · Updated 2 years ago
- Experiments with generating opensource language model assistants ☆97 · Updated last year
- minimal pytorch implementation of bm25 (with sparse tensors) ☆101 · Updated last year
- ☆45 · Updated 3 years ago
- Vespa application making an index of the CORD-19 dataset. ☆39 · Updated 3 months ago
- A monolingual and cross-lingual meta-embedding generation and evaluation framework ☆80 · Updated 3 years ago
- This repository provides scripts for evaluating NLP models on the LEXTREME benchmark, a set of diverse multilingual tasks in legal NLP ☆22 · Updated last year
- A library to synthesize text datasets using Large Language Models (LLM) ☆152 · Updated 2 years ago
- [Data + code] ExpertQA : Expert-Curated Questions and Attributed Answers ☆128 · Updated last year
- ☆43 · Updated 2 years ago
- Pre-train Static Word Embeddings ☆59 · Updated last month
- This repository contains the code for the paper 'PARM: Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval' pu… ☆40 · Updated 3 years ago
- Inquisitive Parrots for Search ☆191 · Updated last year
- ☆97 · Updated 2 years ago
- Our open source implementation of MiniLMv2 (https://aclanthology.org/2021.findings-acl.188) ☆61 · Updated last year