lightonai / ducksearch
Efficient BM25 with DuckDB 🦆
☆44 · Updated 3 months ago
Alternatives and similar repositories for ducksearch:
Users who are interested in ducksearch are comparing it to the libraries listed below:
- NLP with Rust for Python 🦀🐍 ☆61 · Updated 9 months ago
- Tree-based indexes for neural-search ☆29 · Updated last year
- A library to use `modal` as a backend for `joblib`. ☆28 · Updated 2 months ago
- Pre-train Static Word Embeddings ☆49 · Updated 2 weeks ago
- Pipeline components that support partial_fit. ☆45 · Updated 8 months ago
- Multi-threaded matrix multiplication and cosine similarity calculations for dense and sparse matrices. Appropriate for calculating the K … ☆79 · Updated 2 months ago
- Check for data drift between two OpenAI multi-turn chat jsonl files. ☆38 · Updated 11 months ago
- Chrome Extension for exploring Hugging Face datasets ☆49 · Updated 6 months ago
- My NER Experiments with ModernBERT ☆17 · Updated 2 months ago
- spaCy entry points for Curated Transformers ☆27 · Updated 5 months ago
- HNSW implemented in Python ☆19 · Updated 5 years ago
- Lightweight wrapper for the independent implementation of SPLADE++ models for search & retrieval pipelines. Models and Library created by… ☆29 · Updated 7 months ago
- Benchmark study on LanceDB, an embedded vector DB, for full-text search and vector search ☆23 · Updated last year
- Neural Solr = Solr 9 + Mighty Inference + Node ☆17 · Updated 2 years ago
- Python package for deduplication/entity resolution using active learning ☆76 · Updated 7 months ago
- Source code and data for Like a Good Nearest Neighbor ☆28 · Updated 2 months ago
- Python package for extractive NLP using the OpenAI API ☆17 · Updated 6 months ago
- Have UV deal with all your Jupyter deps. ☆24 · Updated 6 months ago
- MoodCat 😼 classifies the mood of English sentences. ☆14 · Updated 2 years ago
- It's a cooler way to store simple linear models. ☆28 · Updated 8 months ago
- ☆58 · Updated 4 months ago
- Showcase how mxbai-embed-large-v1 can be used to produce binary embedding. Binary embeddings enabled 32x storage savings and 40x faster r… ☆15 · Updated last year
- Inference engine for GLiNER models, in Rust ☆43 · Updated last week
- ☆131 · Updated 2 months ago
- A utility for labeling clusters of text data. ☆28 · Updated 3 years ago
- 🤗 HuggingFace Inference Toolkit for Google Cloud Vertex AI (similar to SageMaker's Inference Toolkit, but for Vertex AI and unofficial) ☆17 · Updated last year
- Use sync mode Playwright interactively, inside a Jupyter notebook ☆15 · Updated 3 months ago
- Baguetter is a flexible, efficient, and hackable search engine library implemented in Python. It's designed for quickly benchmarking, imp… ☆173 · Updated 6 months ago