maastrichtlawtech / bsard
A statutory article retrieval dataset in French. (ACL 2022)
★40 · Updated 2 years ago
Alternatives and similar repositories for bsard
Users who are interested in bsard are comparing it to the libraries listed below
- GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embeddings ★44 · Updated last year
- Mr. TyDi is a multi-lingual benchmark dataset built on TyDi, covering eleven typologically diverse languages. ★79 · Updated 3 years ago
- Inquisitive Parrots for Search ★198 · Updated 5 months ago
- A multilingual version of MS MARCO passage ranking dataset ★144 · Updated 2 years ago
- KeyPhraseTransformer lets you quickly extract key phrases, topics, themes from your text data with T5 transformer | Keyphrase extraction… ★106 · Updated last year
- This repository contains the code for the paper 'PARM: Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval' pu… ★41 · Updated 3 years ago
- Code accompanying the submission "Structural Text Segmentation of Legal Documents" by Aumiller et al. ★98 · Updated 2 years ago
- Powerful unsupervised domain adaptation method for dense retrieval. Requires only unlabeled corpus and yields massive improvement: "GPL: … ★338 · Updated 2 years ago
- Ensembling Hugging Face transformers made easy ★62 · Updated 2 years ago
- TimeLMs: Diachronic Language Models from Twitter ★111 · Updated last year
- LexGLUE: A Benchmark Dataset for Legal Language Understanding in English ★229 · Updated 4 months ago
- A multi-purpose toolkit for table-to-text generation: web interface, Python bindings, CLI commands. ★57 · Updated last year
- ★45 · Updated 2 years ago
- Long Document Summarization Papers ★154 · Updated 2 years ago
- No Parameter Left Behind: How Distillation and Model Size Affect Zero-Shot Retrieval ★29 · Updated 3 years ago
- ★54 · Updated 2 years ago
- The autoregressive information extraction system GenIE (Generative Information Extraction) implemented in PyTorch. ★104 · Updated 2 years ago
- Dense hybrid representations for text retrieval ★63 · Updated 2 years ago
- ★87 · Updated 7 months ago
- Dataset from the paper "Mintaka: A Complex, Natural, and Multilingual Dataset for End-to-End Question Answering" (COLING 2022) ★116 · Updated 3 years ago
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la… ★49 · Updated 2 years ago
- ★29 · Updated last year
- Improving Efficient Neural Ranking Models with Cross-Architecture Knowledge Distillation ★113 · Updated 4 years ago
- This repository contains the relevant materials for the tutorial "Legal IR and NLP: the History, Challenges, and State-of-the-Art", held … ★41 · Updated 2 years ago
- This repository contains materials for the SIGIR 2022 tutorial on opinion summarization. ★33 · Updated 3 years ago
- ★37 · Updated 3 weeks ago
- ★18 · Updated last year
- The official code for PRIMERA: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization