lightonai / ducksearch
Efficient BM25 with DuckDB 🦆
☆48 · Updated 4 months ago
Alternatives and similar repositories for ducksearch:
Users who are interested in ducksearch are comparing it to the libraries listed below.
- NLP with Rust for Python 🦀🐍 ☆62 · Updated 11 months ago
- Tree-based indexes for neural-search ☆31 · Updated last year
- Check for data drift between two OpenAI multi-turn chat jsonl files. ☆37 · Updated last year
- Plug-and-play document processing pipelines. No training. Batteries included. ☆57 · Updated last week
- Pre-train Static Word Embeddings ☆58 · Updated 3 weeks ago
- ☆30 · Updated 2 years ago
- 🤗 HuggingFace Inference Toolkit for Google Cloud Vertex AI (similar to SageMaker's Inference Toolkit, but for Vertex AI and unofficial) ☆17 · Updated last year
- A library to use `modal` as a backend for `joblib`. ☆28 · Updated 3 months ago
- Multi-threaded matrix multiplication and cosine similarity calculations for dense and sparse matrices. Appropriate for calculating the K … ☆80 · Updated 4 months ago
- Benchmark study on LanceDB, an embedded vector DB, for full-text search and vector search ☆24 · Updated last year
- ☆35 · Updated 2 weeks ago
- Efficient few-shot learning with cross-encoders. ☆51 · Updated last year
- Baguetter is a flexible, efficient, and hackable search engine library implemented in Python. It's designed for quickly benchmarking, imp… ☆174 · Updated 8 months ago
- Work with static vector models ☆28 · Updated 2 weeks ago
- ☆67 · Updated 5 months ago
- Use sync mode Playwright interactively, inside a Jupyter notebook ☆14 · Updated last month
- Fast and versatile tokenizer for language models, compatible with SentencePiece, Tokenizers, Tiktoken and more. Supports BPE, Unigram and… ☆22 · Updated last month
- The Batched API provides a flexible and efficient way to process multiple requests in a batch, with a primary focus on dynamic batching o… ☆131 · Updated 4 months ago
- spaCy entry points for Curated Transformers ☆29 · Updated 7 months ago
- Showcase how mxbai-embed-large-v1 can be used to produce binary embeddings. Binary embeddings enable 32x storage savings and 40x faster r… ☆17 · Updated last year
- Website for Applied-LLMs work ☆26 · Updated 2 weeks ago
- Graph Engine for Exploration and Search ☆40 · Updated last year
- Neural Solr = Solr 9 + Mighty Inference + Node ☆17 · Updated 2 years ago
- Reference-Free automatic summarization evaluation with potential hallucination detection ☆100 · Updated last year
- Python API for https://vespa.ai, the open big data serving engine ☆121 · Updated this week
- minimal pytorch implementation of bm25 (with sparse tensors) ☆101 · Updated last year
- Prototyping a question and answer bot over PDFs ☆39 · Updated last year
- Truly flash implementation of DeBERTa disentangled attention mechanism. ☆46 · Updated 3 weeks ago
- Library for fast text representation and classification. ☆28 · Updated last year
- ☆43 · Updated 2 months ago