JoelNiklaus / LEXTREME
This repository provides scripts for evaluating NLP models on the LEXTREME benchmark, a set of diverse multilingual tasks in legal NLP.
★22 Updated last year
Alternatives and similar repositories for LEXTREME
Users who are interested in LEXTREME are comparing it to the libraries listed below.
- 🕸️ A graph-augmented dense statute retriever. (EACL 2023) ★21 Updated last year
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la… ★48 Updated last year
- [Data + code] ExpertQA: Expert-Curated Questions and Attributed Answers ★129 Updated last year
- Mining Legal Arguments in Court Decisions - Data and software ★68 Updated 2 years ago
- Retrieval-Augmented Generation battle! ★52 Updated 6 months ago
- Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval (NeurIPS'21) ★45 Updated 3 years ago
- CLIR version of ColBERT ★68 Updated last month
- No Parameter Left Behind: How Distillation and Model Size Affect Zero-Shot Retrieval ★29 Updated 2 years ago
- Code for the arXiv paper: "LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond" ★59 Updated 4 months ago
- Universal, customizable and deployable fine-grained evaluation for text generation. ★23 Updated last year
- Repository for Zheng and Guha et al., 2021, "When Does Pretraining Help? Assessing Self-Supervised Learning for Law and the CaseHOLD Data… ★90 Updated 2 years ago
- Efficient Memory-Augmented Transformers ★34 Updated 2 years ago
- Code for Relevance-guided Supervision for OpenQA with ColBERT (TACL'21) ★41 Updated 3 years ago
- Inquisitive Parrots for Search ★193 Updated 3 weeks ago
- Code and Data for "Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering" ★84 Updated 10 months ago
- INCOME: An Easy Repository for Training and Evaluation of Index Compression Methods in Dense Retrieval. Includes BPR and JPQ. ★24 Updated last year
- Code, datasets, models for the paper "Automatic Evaluation of Attribution by Large Language Models" ★56 Updated last year
- A collection of datasets for language model pretraining, including scripts for downloading, preprocessing, and sampling. ★59 Updated 10 months ago
- Repository for "Attribute First, then Generate: Locally-attributable Grounded Text Generation", ACL 2024 ★29 Updated 6 months ago
- Mr. TyDi is a multi-lingual benchmark dataset built on TyDi, covering eleven typologically diverse languages. ★76 Updated 3 years ago
- The official repository for the paper "Efficient Long-Text Understanding Using Short-Text Models" (Ivgi et al., 2022) ★69 Updated 2 years ago
- A Human-LLM Collaborative Dataset for Generative Information-seeking with Attribution ★33 Updated last year
- Neighborhood Contrastive Learning for Scientific Document Representations with Citation Embeddings (EMNLP 2022 paper) ★69 Updated 2 years ago
- Code and dataset for the EMNLP paper "Instruct and Extract: Instruction Tuning for On-Demand Information Extraction" ★52 Updated last year
- MultiEURLEX - A multi-lingual and multi-label legal document classification dataset for zero-shot cross-lingual transfer ★37 Updated 3 years ago