Efficient BM25 with DuckDB 🦆
☆64 Dec 20, 2024 Updated last year
Alternatives and similar repositories for ducksearch
Users who are interested in ducksearch are comparing it to the libraries listed below.
- NLP with Rust for Python 🦀🐍 ☆72 May 13, 2025 Updated 9 months ago
- Novelty detection for data streams in Python ☆13 Updated this week
- Neural Search ☆367 Mar 11, 2025 Updated 11 months ago
- High-Performance Engine for Multi-Vector Search ☆216 Feb 26, 2026 Updated last week
- Model implementation for the contextual embeddings project ☆41 Jun 2, 2025 Updated 9 months ago
- Autoregressive Bayesian linear model ☆21 Sep 10, 2020 Updated 5 years ago
- 🚲 Git scraping for bike sharing APIs ☆31 Updated this week
- The Batched API provides a flexible and efficient way to process multiple requests in a batch, with a primary focus on dynamic batching o… ☆159 Jul 14, 2025 Updated 7 months ago
- ☀️ Measuring the accuracy of BBC weather forecasts in Honolulu, USA ☆12 Jul 10, 2021 Updated 4 years ago
- Combining encoder-based language models ☆11 Nov 11, 2021 Updated 4 years ago
- PyLate efficient inference engine ☆73 Jan 7, 2026 Updated last month
- Late Interaction Models Training & Retrieval ☆732 Feb 27, 2026 Updated last week
- Plug-and-play document AI with zero-shot models. ☆124 Feb 16, 2026 Updated 2 weeks ago
- Code for Analyzing Redundancy in Pretrained Transformer Models accepted at EMNLP 2020 ☆14 Oct 6, 2020 Updated 5 years ago
- Extra functionalities for river ☆14 May 15, 2024 Updated last year
- Demo application containing fullstack solution (frontend + backend) APIs in pure Golang. ☆18 Dec 5, 2024 Updated last year
- ☆43 Apr 22, 2025 Updated 10 months ago
- bm25 is a scoring function that helps with information retrieval ☆14 Sep 17, 2020 Updated 5 years ago
- Source code accompanying the ICLR2020 publication 'Massively Multilingual Sparse Word Representations' https://openreview.net/forum?id=Hy… ☆12 Aug 15, 2023 Updated 2 years ago
- ☆162 Dec 2, 2024 Updated last year
- real time recommendation playground ☆15 Nov 7, 2022 Updated 3 years ago
- 📈🔍 Lets Python do AB testing analysis. ☆78 Apr 15, 2025 Updated 10 months ago
- Set up a Cost-Effective Modern Data Stack for a Charity ☆19 Mar 26, 2025 Updated 11 months ago
- SMIT: A Simple Modality Integration Tool ☆15 Mar 31, 2024 Updated last year
- Learning BPE embeddings by first learning a segmentation model and then training word2vec ☆19 Dec 18, 2022 Updated 3 years ago
- Official repository for our EACL 2023 paper "LongEval: Guidelines for Human Evaluation of Faithfulness in Long-form Summarization" (https… ☆44 Aug 10, 2024 Updated last year
- Hugging Face RoBERTa with Flash Attention 2 ☆24 Sep 14, 2025 Updated 5 months ago
- An experiment in using DuckDB for a datalog / egg ☆27 Oct 11, 2023 Updated 2 years ago
- 🚕 Self-contained demo using Redpanda, Materialize, River, Redis, and Streamlit to predict taxi trip durations ☆45 Mar 6, 2023 Updated 3 years ago
- Fast and incremental explanations for online machine learning models. Works best with the river framework. ☆55 Dec 26, 2024 Updated last year
- A PyTorch Lightning Callback for pushing models to the Hugging Face Hub 🤗⚡️ ☆35 May 13, 2022 Updated 3 years ago
- Calculation engine for 3CL / DPE ☆24 Feb 13, 2025 Updated last year
- The simplest way to deploy a machine learning model ☆24 Nov 19, 2022 Updated 3 years ago
- Self-contained demo using Kafka, Materialize and Metabase to check what's streaming on Twitch. All you need is Docker and Twitch access t… ☆25 Mar 22, 2022 Updated 3 years ago
- Code for SaGe subword tokenizer (EACL 2023) ☆27 Nov 30, 2024 Updated last year
- XTR: Rethinking the Role of Token Retrieval in Multi-Vector Retrieval ☆61 Jun 20, 2024 Updated last year
- Database backend support for Arquero ☆24 Oct 31, 2022 Updated 3 years ago
- FAMIE: A Fast Active Learning Framework for Multilingual Information Extraction ☆24 Jun 6, 2022 Updated 3 years ago
- ☆25 Apr 1, 2025 Updated 11 months ago