fabiocannizzo / FastBinarySearch
Fast and vectorizable algorithms for searching in a vector of sorted floating point numbers
☆153 · Updated 11 months ago
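FastBinarySearch itself is not reproduced here, but the core idea behind branch-free, vectorizable searches over sorted floats can be sketched in a few lines of C++. This is an illustrative sketch only; `lower_bound_branchless` is a name chosen for this example and is not the library's API or its actual algorithm:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Branchless lower bound over a sorted vector of floats: returns the index
// of the first element >= key. A minimal sketch of the branch-free search
// idea, not FastBinarySearch's implementation.
std::size_t lower_bound_branchless(const std::vector<float>& v, float key) {
    if (v.empty()) return 0;
    std::size_t base = 0;
    std::size_t n = v.size();
    while (n > 1) {
        std::size_t half = n / 2;
        // The ternary typically compiles to a conditional move, avoiding a
        // hard-to-predict branch in the hot loop.
        base = (v[base + half - 1] < key) ? base + half : base;
        n -= half;
    }
    return base + (v[base] < key ? 1 : 0);
}
```

Because the loop body is branch-free and the trip count depends only on the array length, the same structure lends itself to SIMD batching of many simultaneous lookups, which is the direction such libraries take.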
Alternatives and similar repositories for FastBinarySearch
Users interested in FastBinarySearch are comparing it to the libraries listed below:
- High-Performance SGEMM on CUDA devices ☆113 · Updated 10 months ago
- LLM training in simple, raw C/CUDA ☆108 · Updated last year
- Implementation of the paper "Lossless Compression of Vector IDs for Approximate Nearest Neighbor Search" by Severo et al. ☆85 · Updated 10 months ago
- A tracing JIT compiler for PyTorch ☆13 · Updated 4 years ago
- Make Triton easier ☆49 · Updated last year
- Simple high-throughput inference library ☆151 · Updated 6 months ago
- A stand-alone implementation of several NumPy dtype extensions used in machine learning ☆319 · Updated this week
- GPU benchmark ☆73 · Updated 10 months ago
- Inference of Mamba models in pure C ☆194 · Updated last year
- Implementation of "Efficient Multi-vector Dense Retrieval with Bit Vectors", ECIR 2024 ☆66 · Updated last month
- Clover: Quantized 4-bit Linear Algebra Library ☆114 · Updated 7 years ago
- Nod.ai 🦈 version of 👻. You probably want to start at https://github.com/nod-ai/shark for the product and the upstream IREE repository … ☆107 · Updated 2 weeks ago
- Benchmarks to capture important workloads ☆31 · Updated 10 months ago
- CUDA-L2: Surpassing cuBLAS Performance for Matrix Multiplication through Reinforcement Learning ☆142 · Updated last week
- FlexAttention with FlashAttention3 support ☆27 · Updated last year
- GGML implementation of the BERT model with Python bindings and quantization ☆58 · Updated last year
- 🏙 Interactive performance profiling and debugging tool for PyTorch neural networks ☆64 · Updated 10 months ago
- Benchmarking some transformer deployments ☆26 · Updated 2 weeks ago
- Prepare for DeepSeek R1 inference: benchmark CPU, DRAM, SSD, iGPU, GPU, ... with efficient code ☆73 · Updated 10 months ago
- ☆71 · Updated 8 months ago
- Effective transpose on Hopper GPU ☆27 · Updated 3 months ago
- SynapseAI Core is a reference implementation of the SynapseAI API running on Habana Gaudi ☆42 · Updated 10 months ago
- Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models" ☆279 · Updated 2 years ago
- A minimalistic C++ Jinja templating engine for LLM chat templates ☆200 · Updated 2 months ago
- RWKV in nanoGPT style ☆196 · Updated last year
- Notes and artifacts from the ONNX steering committee ☆27 · Updated 2 weeks ago
- An experimental CPU backend for Triton (https://github.com/openai/triton) ☆47 · Updated 3 months ago
- ☆78 · Updated last year
- Inference Llama 2 in one file of pure C++ ☆86 · Updated 2 years ago
- A collection of reproducible inference engine benchmarks ☆38 · Updated 7 months ago