ashvardanian / SimSIMD
Up to 200x Faster Dot Products & Similarity Metrics for Python, Rust, C, JS, and Swift, supporting f64, f32, f16 real & complex, i8, and bit vectors using SIMD for both AVX2, AVX-512, NEON, SVE, & SVE2
★1,215 · Updated this week
Alternatives and similar repositories for SimSIMD:
Users who are interested in SimSIMD are comparing it to the libraries listed below.
- Fast Open-Source Search & Clustering engine × for Vectors & Strings × in C++, C, Python, JavaScript, Rust, Java, Objective-C, Swift, C… ★2,423 · Updated this week
- Up to 10x faster strings for C, C++, Python, Rust, and Swift, leveraging NEON, AVX2, AVX-512, and SWAR to accelerate search, sort, edit d… ★2,364 · Updated last month
- Web Serving and Remote Procedure Calls at 50x lower latency and 70x higher bandwidth than FastAPI, implementing JSON-RPC & REST over io_u… ★1,175 · Updated 2 weeks ago
- Multi-Modal Database replacing MongoDB, Neo4J, and Elastic with 1 faster ACID solution, with NetworkX and Pandas interfaces, and bindings… ★563 · Updated last year
- C++ template library for high performance SIMD based sorting algorithms ★909 · Updated 2 months ago
- Lightning fast data version control system for structured and unstructured machine learning datasets. We aim to make versioning datasets … ★912 · Updated this week
- An extensible, state-of-the-art columnar file format ★1,084 · Updated this week
- RAFT contains fundamental widely-used algorithms and primitives for machine learning and information retrieval. The algorithms are CUDA-a… ★824 · Updated this week
- An efficient C++17 GPU numerical computing library with Python-like syntax ★1,244 · Updated this week
- Learning how to write "Less Slow" code in C++ 20, C 99, CUDA, PTX, & Assembly, from numerics & SIMD to coroutines, ranges, exception hand… ★293 · Updated this week
- Hashed Lookup Table based Matrix Multiplication (halutmatmul) - Stella Nera accelerator ★207 · Updated last year
- Minimal LLM inference in Rust ★962 · Updated 3 months ago
- Stateful load balancer custom-tailored for llama.cpp ★676 · Updated last week
- ★572 · Updated last month
- The Fast Vector Similarity Library is designed to provide efficient computation of various similarity measures between vectors. ★370 · Updated 4 months ago
- ★1,012 · Updated 2 months ago
- A machine learning compiler for GPUs, CPUs, and ML accelerators ★2,874 · Updated this week
- Absolute minimalistic implementation of a GPT-like transformer using only numpy (<650 lines). ★250 · Updated last year
- The fastest hashing algorithm ★841 · Updated last month
- cuVS - a library for vector search and clustering on the GPU ★278 · Updated this week
- Tile primitives for speedy kernels ★1,966 · Updated this week
- nsync is a C library that exports various synchronization primitives, such as mutexes ★1,111 · Updated 6 months ago
- Fast and exact implementation of the C++ from_chars functions for number types: 4x to 10x faster than strtod, part of GCC 12, Chromium, R… ★1,686 · Updated last week
- An easy-to-use and fast library for task-based parallelism, utilizing coroutines. ★318 · Updated 4 months ago
- Unicode routines (UTF8, UTF16, UTF32) and Base64: billions of characters per second using SSE2, AVX2, NEON, AVX-512, RISC-V Vector Extens… ★1,276 · Updated this week
- Complete implementations from "Algorithms for Modern Hardware" ★724 · Updated 2 years ago
- Deep learning at the speed of light. ★1,505 · Updated 3 months ago
- A stand-alone implementation of several NumPy dtype extensions used in machine learning. ★240 · Updated this week
- Fast Static Symbol Table (FSST): efficient random-access string compression ★406 · Updated 5 months ago
- New file format for storage of large columnar datasets. ★473 · Updated this week