ashvardanian / SimSIMD
Up to 200x Faster Dot Products & Similarity Metrics for Python, Rust, C, JS, and Swift, supporting f64, f32, f16 real & complex, i8, and bit vectors using SIMD for both AVX2, AVX-512, NEON, SVE, & SVE2
☆1,410 Updated 3 weeks ago
Alternatives and similar repositories for SimSIMD
Users interested in SimSIMD are comparing it to the libraries listed below.
- Fast Open-Source Search & Clustering engine × for Vectors & Arbitrary Objects × in C++, C, Python, JavaScript, Rust, Java, Objective-C, S… ☆2,882 Updated 2 weeks ago
- Multi-Modal Database replacing MongoDB, Neo4J, and Elastic with 1 faster ACID solution, with NetworkX and Pandas interfaces, and bindings… ☆600 Updated last year
- Up to 10x faster strings for C, C++, Python, Rust, Swift & Go, leveraging NEON, AVX2, AVX-512, SVE, & SWAR to accelerate search, hashing,… ☆2,606 Updated 3 weeks ago
- An efficient C++17 GPU numerical computing library with Python-like syntax ☆1,332 Updated this week
- RAFT contains fundamental widely-used algorithms and primitives for machine learning and information retrieval. The algorithms are CUDA-a… ☆903 Updated this week
- Very fast, high quality, platform-independent hashing algorithm. ☆589 Updated 3 weeks ago
- cuVS - a library for vector search and clustering on the GPU ☆449 Updated this week
- ☆1,042 Updated last month
- C++ template library for high performance SIMD based sorting algorithms ☆949 Updated last week
- Tile primitives for speedy kernels ☆2,478 Updated last week
- nsync is a C library that exports various synchronization primitives, such as mutexes ☆1,177 Updated 2 months ago
- An extensible, state of the art columnar file format. Formerly at @spiraldb, now a Linux Foundation project. ☆1,313 Updated this week
- Playing around "Less Slow" coding practices in C++ 20, C, CUDA, PTX, & Assembly, from numerics & SIMD to coroutines, ranges, exception ha… ☆1,797 Updated last month
- Performance-portable, length-agnostic SIMD with runtime dispatch ☆4,714 Updated this week
- Test and benchmark suite for sort implementations. ☆393 Updated 4 months ago
- Lightning fast data version control system for structured and unstructured machine learning datasets. We aim to make versioning datasets … ☆976 Updated last week
- CUDA Core Compute Libraries ☆1,711 Updated this week
- SIMD Library for Evaluating Elementary Functions, vectorized libm and DFT ☆732 Updated 2 months ago
- Complete implementations from "Algorithms for Modern Hardware" ☆759 Updated 2 years ago
- A fast, compressed, persistent binary data store library for C. ☆502 Updated this week
- RAPIDS Memory Manager ☆589 Updated last week
- A WebGPU-accelerated ONNX inference run-time written 100% in Rust, ready for native and the web ☆1,729 Updated 11 months ago
- nanobind: tiny and efficient C++/Python bindings ☆2,853 Updated last week
- General purpose GPU compute framework built on Vulkan to support 1000s of cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). … ☆2,246 Updated 2 weeks ago
- Hashed Lookup Table based Matrix Multiplication (halutmatmul) - Stella Nera accelerator ☆211 Updated last year
- Deep learning at the speed of light. ☆1,704 Updated this week
- CUDA/Metal accelerated language model inference ☆589 Updated last month
- An easy-to-use and fast library for task-based parallelism, utilizing coroutines. ☆325 Updated 9 months ago
- Repository for nvCOMP docs and examples. nvCOMP is a library for fast lossless compression/decompression on the GPU that can be downloade… ☆592 Updated 9 months ago
- Next-Gen Big Data File Format ☆235 Updated last week