lightonai / ducksearch

Efficient BM25 with DuckDB 🦆

☆53

Alternatives and similar repositories for ducksearch

Users who are interested in ducksearch are comparing it to the libraries listed below.

raphaelsty / LeNLP
NLP with Rust for Python 🦀🐍
☆64 Updated 2 months ago
MinishLab / tokenlearn
Pre-train Static Word Embeddings
☆85 Updated 2 months ago
adrinjalali / joblib-modal
A library to use `modal` as a backend for `joblib`.
☆29 Updated 6 months ago
MantisAI / sieves
Plug-and-play document processing pipelines with zero-shot models.
☆85 Updated this week
lightonai / fast-plaid
High-Performance Engine for Multi-Vector Search
☆132 Updated last month
raphaelsty / neural-tree
Tree-based indexes for neural-search
☆32 Updated last year
mixedbread-ai / baguetter
Baguetter is a flexible, efficient, and hackable search engine library implemented in Python. It's designed for quickly benchmarking, imp…
☆186 Updated 11 months ago
AnswerDotAI / playwrightnb
Use sync mode Playwright interactively, inside a Jupyter notebook
☆15 Updated 4 months ago
lightonai / pylate-rs
PyLate efficient inference engine
☆62 Updated 2 weeks ago
AnswerDotAI / fastkmeans
☆63 Updated last month
Snowflake-Labs / arctic-embed
☆75 Updated 7 months ago
Pleias / OCRoscope
Small Python package to measure OCR quality and other related metrics.
☆25 Updated last year
jxmorris12 / bm25_pt
minimal pytorch implementation of bm25 (with sparse tensors)
☆104 Updated last year
koaning / icepickle
It's a cooler way to store simple linear models.
☆27 Updated last year
UKPLab / eacl2024-lagonn
Source code and data for Like a Good Nearest Neighbor
☆29 Updated 6 months ago
TutteInstitute / evoc
Embedding Vector Oriented Clustering
☆149 Updated 3 months ago
eugeneyan / align-app
☆77 Updated 8 months ago
explosion / spacy-curated-transformers
spaCy entry points for Curated Transformers
☆32 Updated 2 months ago
Knowledgator / FlashDeBERTa
Truly flash implementation of DeBERTa disentangled attention mechanism.
☆62 Updated 2 months ago
mixedbread-ai / batched
The Batched API provides a flexible and efficient way to process multiple requests in a batch, with a primary focus on dynamic batching o…
☆142 Updated 3 weeks ago
rragundez / chunkdot
Multi-threaded matrix multiplication and cosine similarity calculations for dense and sparse matrices. Appropriate for calculating the K …
☆83 Updated 7 months ago
AnswerDotAI / ModernBERT-Instruct-mini-cookbook
☆49 Updated 5 months ago
ashvardanian / jaccard-index
Optimizing bit-level Jaccard Index and Population Counts for large-scale quantized Vector Search via Harley-Seal CSA and Lookup Tables
☆20 Updated 2 months ago
flairNLP / fabricator
[EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models.
☆108 Updated last year
fritshermans / deduplipy
Python package for deduplication/entity resolution using active learning
☆81 Updated 11 months ago
wjbmattingly / spacyex
SpaCyEx allows the creation of spaCy Matcher patterns with regex-like syntax.
☆59 Updated last year
koaning / spacy-report
Generate reports for spaCy models.
☆29 Updated 3 years ago
Muhtasham / summarization-eval
📝 Reference-Free automatic summarization evaluation with potential hallucination detection
☆101 Updated last year
kpu / fasterText
Library for fast text representation and classification.
☆30 Updated last year
pmbaumgartner / spacy-setfit-textcat
☆30 Updated 3 years ago