nreimers / flax-sentence-embeddings
Shared code for training sentence embeddings with Flax / JAX
☆27Updated 3 years ago
Alternatives and similar repositories for flax-sentence-embeddings:
Users that are interested in flax-sentence-embeddings are comparing it to the libraries listed below
- ☆21Updated 3 years ago
- ☆45Updated 2 years ago
- Mr. TyDi is a multi-lingual benchmark dataset built on TyDi, covering eleven typologically diverse languages.☆74Updated 3 years ago
- ☆74Updated 3 years ago
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any hugging face text dataset.☆93Updated 2 years ago
- Dense hybrid representations for text retrieval☆62Updated last year
- ☆96Updated 2 years ago
- ☆36Updated 2 years ago
- ☆55Updated 2 years ago
- ☆84Updated 6 months ago
- INCOME: An Easy Repository for Training and Evaluation of Index Compression Methods in Dense Retrieval. Includes BPR and JPQ.☆22Updated last year
- Official code and model checkpoints for our EMNLP 2022 paper "RankGen - Improving Text Generation with Large Ranking Models" (https://arx…☆136Updated last year
- Research code for the paper "How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models"☆26Updated 3 years ago
- Code and Data for Evaluation WG☆41Updated 2 years ago
- Implementation of Marge, Pre-training via Paraphrasing, in Pytorch☆75Updated 4 years ago
- This repository contains the code for the paper 'PARM: Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval' pu…☆40Updated 3 years ago
- SIGIR 2021: Efficiently Teaching an Effective Dense Retriever with Balanced Topic Aware Sampling☆59Updated 3 years ago
- Hyperparameter Search for AllenNLP☆135Updated last month
- Binary Passage Retriever (BPR) - an efficient passage retriever for open-domain question answering☆168Updated 3 years ago
- Cross language information retrieval pipeline☆18Updated last year
- SPRINT Toolkit helps you evaluate diverse neural sparse models easily using a single click on any IR dataset.☆44Updated last year
- Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.☆77Updated 5 months ago
- This repository accompanies our paper “Do Prompt-Based Models Really Understand the Meaning of Their Prompts?”☆85Updated 2 years ago
- State of the art Semantic Sentence Embeddings☆98Updated 2 years ago
- Automatic metrics for GEM tasks☆64Updated 2 years ago
- ☆18Updated last year
- Implementation of the paper 'Sentence Bottleneck Autoencoders from Transformer Language Models'☆17Updated 2 years ago
- Code and data to support the paper "PAQ 65 Million Probably-Asked Questions andWhat You Can Do With Them"☆202Updated 3 years ago
- ☆67Updated 2 years ago
- EMNLP 2021 - CTC: A Unified Framework for Evaluating Natural Language Generation☆96Updated last year