rragundez / chunkdotLinks

Multi-threaded matrix multiplication and cosine similarity calculations for dense and sparse matrices. Appropriate for calculating the K most similar items for a large number of items by chunking the item matrix representation (embeddings) and using Numba to accelerate the calculations.

☆83

Alternatives and similar repositories for chunkdot

Users that are interested in chunkdot are comparing it to the libraries listed below

Sorting:

Knowledgator / GLiClass
Generalist and Lightweight Model for Text Classification
☆148Updated last month
cohere-ai / DiskVectorIndex
☆210Updated last month
masci / banks
LLM prompt language based on Jinja. Banks provides tools and functions to build prompts text and chat messages from generic blueprints. I…
☆113Updated 2 weeks ago
vespa-engine / pyvespa
Python API for https://vespa.ai, the open big data serving engine
☆133Updated this week
hamelsmu / ft-drift
Check for data drift between two OpenAI multi-turn chat jsonl files.
☆37Updated last year
MinishLab / tokenlearn
Pre-train Static Word Embeddings
☆85Updated 2 months ago
PrithivirajDamodaran / SPLADERunner
Lite weight wrapper for the independent implementation of SPLADE++ models for search & retrieval pipelines. Models and Library created by…
☆32Updated 11 months ago
davanstrien / data-for-fine-tuning-llms
☆77Updated last year
davidberenstein1957 / dataset-viber
Dataset Viber is your chill repo for data collection, annotation and vibe checks.
☆47Updated 10 months ago
henrikalbihn / gliner-as-a-service
GLiNER model in a FastAPI microservice.
☆45Updated 7 months ago
jina-ai / correlations
Simple UI for debugging correlations of text embeddings
☆288Updated 2 months ago
Knowledgator / LiqFit
Efficient few-shot learning with cross-encoders.
☆56Updated last year
IlyasMoutawwakil / py-txi
A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers.
☆33Updated 2 months ago
aniketmaurya / fastserve-ai
Machine Learning Serving focused on GenAI with simplicity as the top priority.
☆59Updated 3 weeks ago
mixedbread-ai / baguetter
Baguetter is a flexible, efficient, and hackable search engine library implemented in Python. It's designed for quickly benchmarking, imp…
☆186Updated 11 months ago
louisbrulenaudet / ragoon
High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡
☆66Updated 8 months ago
MantisAI / sieves
Plug-and-play document processing pipelines with zero-shot models.
☆85Updated 2 months ago
IBM / fastfit
FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes
☆210Updated 2 months ago
PrithivirajDamodaran / Route0x
Low latency, High Accuracy, Custom Query routers for Humans and Agents. Built by Prithivi Da
☆108Updated 4 months ago
EveripediaNetwork / fastc
Unattended Lightweight Text Classifiers with LLM Embeddings
☆185Updated 10 months ago
lightonai / pylate-rs
PyLate efficient inference engine
☆61Updated 2 weeks ago
flowaicom / flow-judge
Code for evaluating with Flow-Judge-v0.1 - an open-source, lightweight (3.8B) language model optimized for LLM system evaluations. Crafte…
☆76Updated 9 months ago
Muhtasham / summarization-eval
📝 Reference-Free automatic summarization evaluation with potential hallucination detection
☆101Updated last year
flairNLP / fabricator
[EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models.
☆108Updated last year
raphaelsty / LeNLP
NLP with Rust for Python 🦀🐍
☆64Updated 2 months ago
premAI-io / benchmarks
🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models.
☆137Updated last year
davidberenstein1957 / fast-sentence-transformers
Simply, faster, sentence-transformers
☆143Updated 11 months ago
qdrant / bm42_eval
Evaluation of bm42 sparse indexing algorithm
☆68Updated last year
mrmps / ai-chunker
Chunk your text using gpt4o-mini more accurately
☆44Updated 11 months ago
stephenleo / llm-structured-output-benchmarks
Benchmark various LLM Structured Output frameworks: Instructor, Mirascope, Langchain, LlamaIndex, Fructose, Marvin, Outlines, etc on task…
☆173Updated 10 months ago