JoelNiklaus / LEXTREME
This repository provides scripts for evaluating NLP models on the LEXTREME benchmark, a set of diverse multilingual tasks in legal NLP
☆21Updated last year
Alternatives and similar repositories for LEXTREME:
Users who are interested in LEXTREME are comparing it to the libraries listed below.
- [Data + code] ExpertQA : Expert-Curated Questions and Attributed Answers☆125Updated 11 months ago
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la…☆46Updated last year
- ☆54Updated 2 years ago
- Code and Data for "Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering"☆83Updated 7 months ago
- The official repository for the paper "Efficient Long-Text Understanding Using Short-Text Models" (Ivgi et al., 2022)☆69Updated last year
- 🕸️ A graph-augmented dense statute retriever. (EACL 2023)☆21Updated last year
- This repository contains the code for the paper 'PARM: Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval' pu…☆40Updated 3 years ago
- StAtutory Reasoning Assessment☆13Updated 2 years ago
- Prompting Large Language Models to Generate Dense and Sparse Representations for Zero-Shot Document Retrieval☆43Updated 4 months ago
- ☆38Updated 2 months ago
- The corresponding code for our paper: "Exploring the Challenges of Open Domain Multi-Document Summarization". Do not hesitate to open an …☆32Updated last year
- ☆45Updated 2 years ago
- Pretraining Efficiently on S2ORC!☆156Updated 4 months ago
- A collection of datasets for language model pretraining including scripts for downloading, preprocessing, and sampling.☆56Updated 7 months ago
- Retrieval-Augmented Generation battle!☆48Updated 2 months ago
- GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embeddings☆37Updated last year
- No Parameter Left Behind: How Distillation and Model Size Affect Zero-Shot Retrieval☆28Updated 2 years ago
- Mining Legal Arguments in Court Decisions - Data and software☆66Updated last year
- Starbucks: Improved Training for 2D Matryoshka Embeddings☆18Updated last month
- ☆8Updated 7 months ago
- ☆97Updated 2 years ago
- InstructIR, a novel benchmark specifically designed to evaluate the instruction-following ability of information retrieval models. Our foc…☆31Updated 9 months ago
- Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval (NeurIPS'21)☆44Updated 3 years ago
- Code, datasets, models for the paper "Automatic Evaluation of Attribution by Large Language Models"☆55Updated last year
- AIS is an evaluation framework for assessing whether the output of natural language models only contains information about the external w…☆31Updated 2 years ago
- Mr. TyDi is a multi-lingual benchmark dataset built on TyDi, covering eleven typologically diverse languages.☆74Updated 3 years ago
- MultiEURLEX - A multi-lingual and multi-label legal document classification dataset for zero-shot cross-lingual transfer☆35Updated 2 years ago
- Zero-shot evaluation on LEXGLUE tasks with GPT-3.5☆27Updated 2 years ago
- Code and model release for the paper "Task-aware Retrieval with Instructions" by Asai et al.☆161Updated last year
- This repository contains the relevant materials for the tutorial "Legal IR and NLP: the History, Challenges, and State-of-the-Art", held …☆41Updated last year