JoelNiklaus / LEXTREME
This repository provides scripts for evaluating NLP models on the LEXTREME benchmark, a set of diverse multilingual tasks in legal NLP.
☆22 Updated last year
Alternatives and similar repositories for LEXTREME
Users who are interested in LEXTREME are comparing it to the repositories listed below.
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la… ☆48 Updated last year
- [Data + code] ExpertQA: Expert-Curated Questions and Attributed Answers ☆128 Updated last year
- 🕸️ A graph-augmented dense statute retriever. (EACL 2023) ☆21 Updated last year
- Mining Legal Arguments in Court Decisions - Data and software ☆68 Updated 2 years ago
- ☆29 Updated last year
- This repository contains the code for the paper 'PARM: Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval' pu… ☆40 Updated 3 years ago
- ☆54 Updated 2 years ago
- No Parameter Left Behind: How Distillation and Model Size Affect Zero-Shot Retrieval ☆29 Updated 2 years ago
- ☆28 Updated last year
- ☆98 Updated 2 years ago
- Retrieval-Augmented Generation battle! ☆51 Updated 5 months ago
- ☆86 Updated 2 months ago
- ☆47 Updated 3 years ago
- GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embeddings ☆43 Updated last year
- CLIR version of ColBERT ☆67 Updated last month
- Code, datasets, models for the paper "Automatic Evaluation of Attribution by Large Language Models" ☆56 Updated last year
- The codebase for our ACL2023 paper: Did You Read the Instructions? Rethinking the Effectiveness of Task Definitions in Instruction Learni… ☆29 Updated last year
- Repository for Zheng and Guha et al., 2021, "When Does Pretraining Help? Assessing Self-Supervised Learning for Law and the CaseHOLD Data… ☆90 Updated 2 years ago
- StAtutory Reasoning Assessment ☆13 Updated 2 years ago
- Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval (NeurIPS'21) ☆44 Updated 3 years ago
- Pretraining Efficiently on S2ORC! ☆164 Updated 7 months ago
- Automatically detect errors in annotated corpora. ☆47 Updated last year
- Dense hybrid representations for text retrieval ☆62 Updated 2 years ago
- ☆18 Updated 9 months ago
- Code and pre-trained models for "ReasonBert: Pre-trained to Reason with Distant Supervision", EMNLP'2021 ☆29 Updated 2 years ago
- SPRINT Toolkit helps you evaluate diverse neural sparse models easily using a single click on any IR dataset. ☆45 Updated last year
- A Human-LLM Collaborative Dataset for Generative Information-seeking with Attribution ☆31 Updated last year
- This repository contains the relevant materials for the tutorial "Legal IR and NLP: the History, Challenges, and State-of-the-Art", held … ☆41 Updated 2 years ago
- Dataset from the paper "Mintaka: A Complex, Natural, and Multilingual Dataset for End-to-End Question Answering" (COLING 2022) ☆113 Updated 2 years ago
- Starbucks: Improved Training for 2D Matryoshka Embeddings ☆20 Updated 4 months ago