maastrichtlawtech / bsard
A statutory article retrieval dataset in French. (ACL 2022)
☆39, updated last year
Alternatives and similar repositories for bsard
Users who are interested in bsard are comparing it to the repositories listed below.
- A multilingual version of the MS MARCO passage ranking dataset (☆145, updated last year)
- GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embeddings (☆43, updated last year)
- Mr. TyDi is a multilingual benchmark dataset built on TyDi, covering eleven typologically diverse languages. (☆75, updated 3 years ago)
- Dense hybrid representations for text retrieval (☆62, updated 2 years ago)
- Efficient Attention for Long Sequence Processing (☆94, updated last year)
- A multi-purpose toolkit for table-to-text generation: web interface, Python bindings, CLI commands. (☆55, updated last year)
- ☆37, updated 2 years ago
- ☆59, updated 2 years ago
- ☆86, updated last month
- ☆54, updated 2 years ago
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la… (☆48, updated last year)
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data; it should also work with any Hugging Face text dataset. (☆93, updated 2 years ago)
- Powerful unsupervised domain adaptation method for dense retrieval. Requires only an unlabeled corpus and yields massive improvement: "GPL: … (☆333, updated last year)
- Code accompanying the submission "Structural Text Segmentation of Legal Documents" by Aumiller et al. (☆96, updated last year)
- Improving Efficient Neural Ranking Models with Cross-Architecture Knowledge Distillation (☆110, updated 3 years ago)
- This repository contains the code for the paper 'PARM: Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval' pu… (☆40, updated 3 years ago)
- ☆42, updated 2 years ago
- Neighborhood Contrastive Learning for Scientific Document Representations with Citation Embeddings (EMNLP 2022 paper) (☆67, updated 2 years ago)
- Unified Learned Sparse Retrieval Framework (☆64, updated last year)
- ☆97, updated 2 years ago
- This repository contains the relevant materials for the tutorial "Legal IR and NLP: the History, Challenges, and State-of-the-Art", held … (☆41, updated 2 years ago)
- CLIR version of ColBERT (☆67, updated 2 weeks ago)
- MFAQ: a Multilingual FAQ Dataset (☆17, updated last year)
- Ensembling Hugging Face transformers made easy (☆62, updated 2 years ago)
- Inquisitive Parrots for Search (☆191, updated last year)
- Automatically detect errors in annotated corpora. (☆47, updated last year)
- Dataset for NAACL 2021 paper: "QMSum: A New Benchmark for Query-based Multi-domain Meeting Summarization" (☆120, updated last year)
- Repo for Aspire, a scientific document similarity model based on matching fine-grained aspects of scientific papers. (☆52, updated last year)
- Code to reproduce NeuralMind's submissions to COLIEE 2021 and COLIEE 2022 (☆24, updated 2 years ago)
- This repository contains the code for the paper "Prompting ELECTRA: Few-Shot Learning with Discriminative Pre-Trained Models". (☆48, updated 2 years ago)