kioxia-jp / aisaq-diskann
All-in-Storage Solution based on DiskANN for DRAM-free Approximate Nearest Neighbor Search
☆69 · Updated last month
Alternatives and similar repositories for aisaq-diskann
Users interested in aisaq-diskann are comparing it to the libraries listed below.
- InferX is an Inference Function as a Service Platform ☆123 · Updated 2 weeks ago
- No-code CLI designed for accelerating ONNX workflows ☆207 · Updated 2 months ago
- Lemonade helps users run local LLMs with the highest performance by configuring state-of-the-art inference engines for their NPUs and GPU… ☆381 · Updated this week
- Lightweight inference server for OpenVINO ☆193 · Updated this week
- Prepare for DeepSeek R1 inference: Benchmark CPU, DRAM, SSD, iGPU, GPU, ... with efficient code. ☆73 · Updated 6 months ago
- Rust crates for XetHub ☆51 · Updated 9 months ago
- AI Tensor Engine for ROCm ☆249 · Updated this week
- Minimal Linux OS with a Model Context Protocol (MCP) gateway to expose local capabilities to LLMs. ☆260 · Updated last month
- Wraps any OpenAI API interface as Responses with MCP support so it supports Codex, adding any missing stateful features. Ollama and vLLM… ☆81 · Updated last month
- High-performance safetensors model loader ☆53 · Updated 3 weeks ago
- ☆59 · Updated last year
- Tenstorrent console-based hardware information program ☆49 · Updated this week
- DIS: blockDevice over Immutable Storage ☆70 · Updated 3 years ago
- Bamboo-7B Large Language Model ☆93 · Updated last year
- This is the documentation repository for SGLang. It is auto-generated from https://github.com/sgl-project/sglang/tree/main/docs. ☆68 · Updated this week
- Run multiple resource-heavy Large Models (LM) on the same machine with a limited amount of VRAM/other resources by exposing them on differe… ☆67 · Updated last month
- LLM inference on consumer devices ☆123 · Updated 4 months ago
- High-speed and easy-to-use LLM serving framework for local deployment ☆115 · Updated this week
- Horizon chart for CPU/GPU/Neural Engine utilization monitoring. Supports Apple M1-M4, Nvidia GPUs, AMD GPUs ☆26 · Updated 2 weeks ago
- Samples of good AI-generated CUDA kernels ☆88 · Updated 2 months ago
- Lightweight daemon for monitoring CUDA runtime API calls with eBPF uprobes ☆122 · Updated 4 months ago
- Build userspace NVMe drivers and storage applications with CUDA support ☆384 · Updated last year
- A tshark MCP server for packet capture and analysis ☆18 · Updated last month
- A minimalistic C++ Jinja templating engine for LLM chat templates ☆164 · Updated this week
- DCPerf benchmark suite for hyperscale cloud applications ☆197 · Updated this week
- TPI-LLM: Serving 70b-scale LLMs Efficiently on Low-resource Edge Devices ☆186 · Updated 2 months ago
- The HIP Environment and ROCm Kit - A lightweight open source build system for HIP and ROCm ☆285 · Updated this week
- ☆196 · Updated 3 months ago
- Let LLMs control embedded devices via the Model Context Protocol. ☆144 · Updated last month
- A library for constructing allocators and memory pools. It also contains broadly useful abstractions and utilities for memory management.… ☆67 · Updated this week