Neural information retrieval / Semantic search / Bi-encoders
☆174Aug 5, 2023Updated 2 years ago
Alternatives and similar repositories for information-retrieval
Users that are interested in information-retrieval are comparing it to the libraries listed below
- Combining encoder-based language models☆11Nov 11, 2021Updated 4 years ago
- Open-Source Information Retrieval Courses @ TU Wien☆698Jun 12, 2023Updated 2 years ago
- Submission archive for the MS MARCO passage ranking leaderboard☆13Apr 21, 2023Updated 2 years ago
- The official repository for the LREC 2022 paper "D3: A Massive Dataset of Scholarly Metadata for Analyzing the State of Computer Science …☆29Nov 25, 2022Updated 3 years ago
- [WIP] Behold, semantic-search, built over sentence-transformers to make it easy for search engineers to evaluate, optimise and deploy mod…☆15Apr 21, 2023Updated 2 years ago
- ☆21Sep 6, 2021Updated 4 years ago
- Code accompanying the paper Pretraining Language Models with Human Preferences☆180Feb 13, 2024Updated 2 years ago
- Training & evaluation library for text-based neural re-ranking and dense retrieval models built with PyTorch☆265Jan 27, 2023Updated 3 years ago
- Arabic News Stance Corpus☆11Feb 5, 2021Updated 5 years ago
- provides a common interface to many IR measure tools☆96Feb 17, 2026Updated 2 weeks ago
- Tool for comparing two ranked lists (TREC run files)☆20Nov 9, 2022Updated 3 years ago
- Our open source implementation of MiniLMv2 (https://aclanthology.org/2021.findings-acl.188)☆61Jun 12, 2023Updated 2 years ago
- Token-free Language Modeling with ByGPT5 & Friends!☆12Jul 18, 2025Updated 7 months ago
- SQuARE: Software for question answering research.☆75Jun 25, 2024Updated last year
- Machine learning prediction of movies genres using Gensim's Doc2Vec and PyMongo - (Python, MongoDB)☆37Dec 8, 2022Updated 3 years ago
- Scalable training for dense retrieval models.☆298Jun 10, 2025Updated 8 months ago
- Accurate word segmentation for hashtags and text, powered by Transformers and Beam Search. A scalable alternative to heuristic splitters …☆77Jan 8, 2026Updated 2 months ago
- Official Repository of Six Dragons Fly Again (ISMIR 2024)☆13Nov 13, 2025Updated 3 months ago
- ☆13Jun 2, 2022Updated 3 years ago
- Fast search index for SPLADE sparse retrieval models implemented in Python using Numpy and Numba☆36Oct 16, 2025Updated 4 months ago
- opentqa is an open framework for textbook question answering, which includes xtqa, mcan, cmr, mfb, and mutan.☆11Mar 27, 2021Updated 4 years ago
- ☆24Oct 8, 2024Updated last year
- ☆31Sep 7, 2023Updated 2 years ago
- A multilingual version of MS MARCO passage ranking dataset☆147Oct 19, 2023Updated 2 years ago
- Official code repository for "Exploring Neural Models for Query-Focused Summarization".☆51Jun 12, 2023Updated 2 years ago
- This repository contains an easy and intuitive approach to few-shot NER using most similar expansion over spaCy embeddings. Now with enti…☆244Jun 19, 2023Updated 2 years ago
- Semantic search built on Paddle and deployed online, with multilingual support.☆13Aug 11, 2022Updated 3 years ago
- TDCleaner: A Tool for Detecting Obsolete TODO Comments in Software Repos☆12Dec 9, 2021Updated 4 years ago
- A Graph-based Pattern Representations Tutorial☆10Jul 15, 2019Updated 6 years ago
- DALLE-tools provides useful dataset utilities to improve your workflow with WebDatasets.☆14Mar 9, 2022Updated 4 years ago
- Code for "Incorporating Relevance Feedback for Information-Seeking Retrieval using Few-Shot Document Re-Ranking" (https://arxiv.org/abs/2…☆14Feb 2, 2026Updated last month
- Official repository of Myna: Masking-Based Contrastive Learning of Musical Representations☆17Mar 31, 2025Updated 11 months ago
- Fine-tuning Quantized Neural Networks with Zeroth-order Optimization☆16Sep 17, 2025Updated 5 months ago
- ☆12Apr 29, 2022Updated 3 years ago
- SMASHED is a toolkit designed to apply transformations to samples in datasets, such as fields extraction, tokenization, prompting, batchi…☆35May 24, 2024Updated last year
- A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.☆2,095Oct 16, 2025Updated 4 months ago
- Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.☆2,026Updated this week
- A library for building and serving multi-node distributed faiss indices.☆276Nov 1, 2023Updated 2 years ago
- Active Learning for Text Classification in Python☆639Feb 1, 2026Updated last month