jbarrow / tinyhnsw
build your own vector database -- the littlest hnsw
☆19Updated 9 months ago
Related projects:
- Vector Database with support for late interaction and token level embeddings.☆51Updated last week
- Implementation of "Efficient Multi-vector Dense Retrieval with Bit Vectors", ECIR 2024☆53Updated 3 weeks ago
- Binary vector search example using Unum's USearch engine and pre-computed Wikipedia embeddings from Co:here and MixedBread☆18Updated 5 months ago
- Production ready extractors for transformation, extracting embedding or structured data from unstructured data sources.☆28Updated last week
- ☆203Updated 2 months ago
- Efficient BM25 indexing using Rust☆11Updated this week
- Baguetter is a flexible, efficient, and hackable search engine library implemented in Python. It's designed for quickly benchmarking, imp…☆136Updated 3 weeks ago
- Parallel Computing starter project to build GPU & CPU kernels in CUDA & C++ and call them from Python without a single line of CMake usin…☆16Updated 3 weeks ago
- Creating Generative AI Apps which work☆16Updated 2 months ago
- 🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models.☆129Updated last month
- HNSW implemented in Python☆18Updated 4 years ago
- Tools for various benchmarking scenarios☆24Updated this week
- This is the repo for the container that holds the models for the text2vec-transformers module☆38Updated 3 weeks ago
- GGML implementation of BERT model with Python bindings and quantization.☆51Updated 7 months ago
- Locality Sensitive Hashing☆67Updated last year
- Data extraction with LLM on CPU☆62Updated 10 months ago
- Experimentation on Google's Gemma model☆16Updated 6 months ago
- End-to-End Local-First Text-to-SQL Pipelines☆59Updated this week
- Check for data drift between two OpenAI multi-turn chat jsonl files.☆33Updated 5 months ago
- Chrome Extension for exploring Hugging Face datasets 🔎☆48Updated last month
- Efficient vector database for hundreds of millions of embeddings.☆196Updated 4 months ago
- A pipeline for using API calls to agnostically convert unstructured data into structured training data☆26Updated last year
- Benchmark study on LanceDB, an embedded vector DB, for full-text search and vector search☆17Updated 9 months ago
- numpy ufuncs for vector similarity☆14Updated 9 months ago
- Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs☆128Updated this week
- This repo is the central repo for all the RAG Evaluation reference material and partner workshop☆39Updated 2 months ago
- NLP with Rust for Python 🦀🐍☆57Updated 3 months ago
- Supervised instruction finetuning for LLM with HF trainer and Deepspeed☆32Updated last year
- Code repository for the paper - "AdANNS: A Framework for Adaptive Semantic Search"☆57Updated 11 months ago
- Tree-based indexes for neural-search☆28Updated 6 months ago