maastrichtlawtech / bsard
A statutory article retrieval dataset in French. (ACL 2022)
☆40 · Updated 2 years ago
Alternatives and similar repositories for bsard
Users who are interested in bsard are comparing it to the libraries listed below.
- A multilingual version of MS MARCO passage ranking dataset ☆144 · Updated 2 years ago
- GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embeddings ☆44 · Updated last year
- Powerful unsupervised domain adaptation method for dense retrieval. Requires only unlabeled corpus and yields massive improvement: "GPL: … ☆337 · Updated 2 years ago
- This repository contains the code for the paper 'PARM: Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval' pu… ☆41 · Updated 3 years ago
- Inquisitive Parrots for Search ☆198 · Updated 4 months ago
- KeyPhraseTransformer lets you quickly extract key phrases, topics, themes from your text data with T5 transformer | Keyphrase extraction… ☆106 · Updated last year
- Mr. TyDi is a multi-lingual benchmark dataset built on TyDi, covering eleven typologically diverse languages. ☆79 · Updated 3 years ago
- ☆14 · Updated last year
- Easy modernBERT fine-tuning and multi-task learning ☆61 · Updated 3 months ago
- TimeLMs: Diachronic Language Models from Twitter ☆111 · Updated last year
- A multi-purpose toolkit for table-to-text generation: web interface, Python bindings, CLI commands. ☆56 · Updated last year
- Ensembling Hugging Face transformers made easy ☆62 · Updated 2 years ago
- ☆86 · Updated 6 months ago
- ☆45 · Updated 2 years ago
- Repo for Aspire - A scientific document similarity model based on matching fine-grained aspects of scientific papers. ☆54 · Updated 2 years ago
- A Python Search Engine for Humans 🥸 ☆237 · Updated last year
- Repository for Zheng and Guha et al., 2021, "When Does Pretraining Help? Assessing Self-Supervised Learning for Law and the CaseHOLD Data… ☆92 · Updated 2 years ago
- Code and model release for the paper "Task-aware Retrieval with Instructions" by Asai et al. ☆165 · Updated 2 years ago
- The official code for PRIMERA: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization ☆156 · Updated 2 years ago
- ☆54 · Updated 2 years ago
- Multi-task modelling extensions for huggingface transformers ☆21 · Updated 2 years ago
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la… ☆49 · Updated last year
- Code accompanying the submission "Structural Text Segmentation of Legal Documents" by Aumiller et al. ☆98 · Updated 2 years ago
- Dataset from the paper "Mintaka: A Complex, Natural, and Multilingual Dataset for End-to-End Question Answering" (COLING 2022) ☆114 · Updated 2 years ago
- Search Engines with Autoregressive Language models ☆292 · Updated 2 years ago
- multimodal document analysis ☆167 · Updated last year
- ☆79 · Updated last year
- The autoregressive information extraction system GenIE (Generative Information Extraction) implemented in PyTorch. ☆104 · Updated 2 years ago
- ☆29 · Updated last year
- This repository contains materials for the SIGIR 2022 tutorial on opinion summarization. ☆34 · Updated 3 years ago