ashvardanian / jaccard-index

Optimizing bit-level Jaccard Index and Population Counts for large-scale quantized Vector Search via Harley-Seal CSA and Lookup Tables
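For readers unfamiliar with the terms in the description, the bit-level Jaccard index can be sketched as the population count of the bitwise AND divided by the population count of the bitwise OR. The sketch below is a minimal illustration, not the repository's code: it assumes vectors are packed into equal-length byte strings, the names `POPCOUNT_LUT`, `popcount`, and `jaccard_bits` are hypothetical, and it shows only the lookup-table approach, not the Harley-Seal carry-save adder. Returning 1.0 for two all-zero vectors is also an assumed convention.

```python
# Hypothetical sketch: bit-level Jaccard index via a byte-wise popcount lookup table.
# None of these names come from the jaccard-index repository.

# 256-entry table: POPCOUNT_LUT[b] is the number of set bits in byte value b.
POPCOUNT_LUT = bytes(bin(b).count("1") for b in range(256))


def popcount(data: bytes) -> int:
    """Count the set bits in a byte string, one table lookup per byte."""
    return sum(POPCOUNT_LUT[b] for b in data)


def jaccard_bits(a: bytes, b: bytes) -> float:
    """Jaccard index of two equal-length bit vectors packed into bytes."""
    if len(a) != len(b):
        raise ValueError("bit vectors must have the same length")
    intersection = popcount(bytes(x & y for x, y in zip(a, b)))
    union = popcount(bytes(x | y for x, y in zip(a, b)))
    # Assumed convention: two all-zero vectors are treated as identical.
    return intersection / union if union else 1.0


if __name__ == "__main__":
    a = bytes([0b11110000, 0b00001111])
    b = bytes([0b11000000, 0b00001111])
    print(jaccard_bits(a, b))
```

In this example the two vectors share 6 set bits out of 8 in their union, so the script prints 0.75.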

☆20

Alternatives and similar repositories for jaccard-index

Users who are interested in jaccard-index compare it to the libraries listed below.

AnswerDotAI / ModernBERT-Instruct-mini-cookbook
☆48 Updated 5 months ago
raphaelsty / LeNLP
NLP with Rust for Python 🦀🐍
☆63 Updated 2 months ago
krypticmouse / matryoshka-representation-learning
PyTorch implementation for MRL
☆19 Updated last year
taylorai / onnx_embedding_models
utilities for loading and running text embeddings with onnx
☆44 Updated 11 months ago
teknium1 / transformers-gptq-quant
☆47 Updated last year
Alignment-Lab-AI / datagen
a pipeline for using api calls to agnostically convert unstructured data into structured training data
☆30 Updated 9 months ago
raphaelsty / neural-tree
Tree-based indexes for neural-search
☆32 Updated last year
MinishLab / tokenlearn
Pre-train Static Word Embeddings
☆84 Updated last month
PrithivirajDamodaran / SPLADERunner
Lightweight wrapper for the independent implementation of SPLADE++ models for search & retrieval pipelines. Models and Library created by…
☆31 Updated 10 months ago
modal-labs / ci-on-modal
A sample pattern for running CI tests on Modal
☆18 Updated 3 months ago
AnswerDotAI / fastkmeans
☆62 Updated last week
hamelsmu / ft-drift
Check for data drift between two OpenAI multi-turn chat jsonl files.
☆37 Updated last year
ChrisHayduk / QLoRA-for-MLM
QLoRA for Masked Language Modeling
☆22 Updated last year
huggingface / wikirace-llms
☆23 Updated 2 months ago
Knowledgator / FlashDeBERTa
Truly flash implementation of DeBERTa disentangled attention mechanism.
☆62 Updated 2 months ago
lightonai / pylate-rs
PyLate efficient inference engine
☆57 Updated this week
FL33TW00D / embd
GPU accelerated client-side embeddings for vector search, RAG etc.
☆66 Updated last year
catid / lllm
Latent Large Language Models
☆18 Updated 10 months ago
CarperAI / squeakily
A library for squeakily cleaning and filtering language datasets.
☆47 Updated 2 years ago
alvarobartt / vertex-ai-huggingface-inference-toolkit
🤗 HuggingFace Inference Toolkit for Google Cloud Vertex AI (similar to SageMaker's Inference Toolkit, but for Vertex AI and unofficial)
☆17 Updated last year
thomasnormal / fewshot
☆28 Updated 3 weeks ago
RAIVNLab / AdANNS
Code repository for the paper - "AdANNS: A Framework for Adaptive Semantic Search"
☆65 Updated last year
minosvasilias / simple_grpo
Simple GRPO scripts and configurations.
☆59 Updated 5 months ago
hamelsmu / replicate-examples
☆22 Updated last year
rwightman / genalog
Genalog is an open source, cross-platform python package allowing generation of synthetic document images with custom degradations and te…
☆42 Updated last year
deployradiant / pychatml
Chat Markup Language conversation library
☆55 Updated last year
s-smits / grpo-optuna
Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna
☆55 Updated 5 months ago
kevinwu23 / StanfordFineTuneBench
☆30 Updated 8 months ago
xjdr-alt / muzero_sketch
☆38 Updated 11 months ago
lightonai / ducksearch
Efficient BM25 with DuckDB 🦆
☆52 Updated 6 months ago