maastrichtlawtech / bsard
A statutory article retrieval dataset in French. (ACL 2022)
☆40 · Updated last year
Alternatives and similar repositories for bsard
Users who are interested in bsard are comparing it to the repositories listed below.
- GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embeddings ☆43 · Updated last year
- Mr. TyDi is a multi-lingual benchmark dataset built on TyDi, covering eleven typologically diverse languages. ☆78 · Updated 3 years ago
- A multilingual version of MS MARCO passage ranking dataset ☆144 · Updated last year
- Inquisitive Parrots for Search ☆194 · Updated 2 months ago
- Long Document Summarization Papers ☆149 · Updated 2 years ago
- Dataset from the paper "Mintaka: A Complex, Natural, and Multilingual Dataset for End-to-End Question Answering" (COLING 2022) ☆114 · Updated 2 years ago
- Ensembling Hugging Face transformers made easy ☆63 · Updated 2 years ago
- TimeLMs: Diachronic Language Models from Twitter ☆109 · Updated last year
- (no description) ☆54 · Updated 2 years ago
- Powerful unsupervised domain adaptation method for dense retrieval. Requires only unlabeled corpus and yields massive improvement: "GPL: … ☆337 · Updated 2 years ago
- Easy modernBERT fine-tuning and multi-task learning ☆59 · Updated last month
- A multi-purpose toolkit for table-to-text generation: web interface, Python bindings, CLI commands. ☆55 · Updated last year
- This repository contains the code for the paper 'PARM: Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval' pu… ☆41 · Updated 3 years ago
- Code accompanying the submission "Structural Text Segmentation of Legal Documents" by Aumiller et al. ☆97 · Updated last year
- This repository contains the relevant materials for the tutorial "Legal IR and NLP: the History, Challenges, and State-of-the-Art", held … ☆41 · Updated 2 years ago
- (no description) ☆44 · Updated 2 years ago
- Search Engines with Autoregressive Language models ☆291 · Updated 2 years ago
- multimodal document analysis ☆165 · Updated last year
- (no description) ☆42 · Updated 3 years ago
- Repo for Aspire - A scientific document similarity model based on matching fine-grained aspects of scientific papers. ☆54 · Updated last year
- Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback ☆97 · Updated last year
- Long-context pretrained encoder-decoder models ☆96 · Updated 2 years ago
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la… ☆49 · Updated last year
- The official code for PRIMERA: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization ☆157 · Updated 2 years ago
- A Python Search Engine for Humans 🥸 ☆226 · Updated last year
- Repository for Zheng and Guha et al., 2021, "When Does Pretraining Help? Assessing Self-Supervised Learning for Law and the CaseHOLD Data… ☆90 · Updated 2 years ago
- Legal document similarity - Code, data, and models for the ICAIL 2021 paper "Evaluating Document Representations for Content-based Legal … ☆32 · Updated 4 years ago
- The model implementations for T5 encoder decoder soft prompt tuning for text generation. ☆24 · Updated 2 years ago
- Code and model release for the paper "Task-aware Retrieval with Instructions" by Asai et al. ☆163 · Updated last year
- (no description) ☆78 · Updated 10 months ago