ashvardanian / SimSIMD
Up to 200x Faster Dot Products & Similarity Metrics for Python, Rust, C, JS, and Swift, supporting f64, f32, f16 real & complex, i8, and bit vectors, using SIMD on AVX2, AVX-512, NEON, SVE, & SVE2 📐
☆1,343 Updated 3 weeks ago
Alternatives and similar repositories for SimSIMD:
Users who are interested in SimSIMD are comparing it to the libraries listed below.
- Fast Open-Source Search & Clustering engine × for Vectors & 🔜 Strings × in C++, C, Python, JavaScript, Rust, Java, Objective-C, Swift, C… ☆2,655 Updated last week
- Up to 10x faster strings for C, C++, Python, Rust, Swift & Go, leveraging NEON, AVX2, AVX-512, SVE, & SWAR to accelerate search, hashing,… ☆2,516 Updated last week
- Web Serving and Remote Procedure Calls at 50x lower latency and 70x higher bandwidth than FastAPI, implementing JSON-RPC & REST over io_u… ☆1,209 Updated 3 months ago
- An efficient C++17 GPU numerical computing library with Python-like syntax ☆1,313 Updated this week
- RAFT contains fundamental widely-used algorithms and primitives for machine learning and information retrieval. The algorithms are CUDA-a… ☆870 Updated this week
- An extensible, state of the art columnar file format ☆1,188 Updated this week
- cuVS - a library for vector search and clustering on the GPU ☆382 Updated this week
- C++ template library for high performance SIMD based sorting algorithms ☆927 Updated this week
- Complete implementations from "Algorithms for Modern Hardware" ☆748 Updated 2 years ago
- ☆577 Updated this week
- CUDA Core Compute Libraries ☆1,610 Updated this week
- The Fast Vector Similarity Library is designed to provide efficient computation of various similarity measures between vectors. ☆385 Updated last month
- Hashed Lookup Table based Matrix Multiplication (halutmatmul) - Stella Nera accelerator ☆209 Updated last year
- Playing around "Less Slow" coding practices in C++ 20, C, CUDA, PTX, & Assembly, from numerics & SIMD to coroutines, ranges, exception ha… ☆1,624 Updated this week
- Test and benchmark suite for sort implementations. ☆387 Updated last month
- Fast and exact implementation of the C++ from_chars functions for number types: 4x to 10x faster than strtod, part of GCC 12, Chromium, R… ☆1,740 Updated last month
- RAPIDS Memory Manager ☆572 Updated last week
- A collection of lock-free data structures written in standard C++11 ☆874 Updated 3 months ago
- ☆1,273 Updated last year
- ☆537 Updated this week
- Fast, SQL powered, in-process vector search for any language with an SQLite driver ☆297 Updated 5 months ago
- Stateful load balancer custom-tailored for llama.cpp 🏓🦙 ☆744 Updated this week
- Multi-Threaded FP32 Matrix Multiplication on x86 CPUs ☆347 Updated this week
- nanobind: tiny and efficient C++/Python bindings ☆2,738 Updated this week
- Implementations of SIMD instruction sets for systems which don't natively support them. ☆2,640 Updated 2 weeks ago
- Performance-portable, length-agnostic SIMD with runtime dispatch ☆4,564 Updated this week
- Lightning fast data version control system for structured and unstructured machine learning datasets. We aim to make versioning datasets … ☆930 Updated last week
- Framework for evaluating ANNS algorithms on billion scale datasets. ☆373 Updated last week
- ☆242 Updated last year
- A WebGPU-accelerated ONNX inference run-time written 100% in Rust, ready for native and the web ☆1,727 Updated 9 months ago