kioxia-jp / aisaq-diskann
All-in-Storage Solution based on DiskANN for DRAM-free Approximate Nearest Neighbor Search
☆66Updated 2 weeks ago
Alternatives and similar repositories for aisaq-diskann
Users who are interested in aisaq-diskann are comparing it to the libraries listed below.
- Local LLM Server with GPU and NPU Acceleration☆206Updated this week
- InferX is an Inference Function as a Service Platform☆116Updated 2 weeks ago
- Lightweight Inference server for OpenVINO☆188Updated this week
- No-code CLI designed for accelerating ONNX workflows☆201Updated last month
- Run multiple resource-heavy Large Models (LM) on the same machine with limited amount of VRAM/other resources by exposing them on differe…☆67Updated 2 weeks ago
- TensorRT-LLM server with Structured Outputs (JSON) built with Rust☆55Updated 2 months ago
- GPU Power and Performance Manager☆60Updated 9 months ago
- Intel® AI Assistant Builder☆88Updated 2 weeks ago
- Rust crates for XetHub☆43Updated 9 months ago
- Samples of good AI generated CUDA kernels☆84Updated last month
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs☆87Updated this week
- Turns devices into a scalable LLM platform☆150Updated last week
- Prepare for DeepSeek R1 inference: Benchmark CPU, DRAM, SSD, iGPU, GPU, ... with efficient code.☆72Updated 5 months ago
- A simple tool to anonymize LLM prompts.☆63Updated 5 months ago
- Horizon chart for CPU/GPU/Neural Engine utilization monitoring. Supports Apple M1-M4, Nvidia GPUs, AMD GPUs☆25Updated 3 weeks ago
- A daemon that automatically manages the performance states of NVIDIA GPUs.☆89Updated last month
- The HIP Environment and ROCm Kit - A lightweight open source build system for HIP and ROCm☆234Updated this week
- AI Tensor Engine for ROCm☆232Updated this week
- Wraps any OpenAI API interface as Responses with MCPs support so it supports Codex. Adding any missing stateful features. Ollama and Vllm…☆72Updated 2 weeks ago
- Vector Database with support for late interaction and token level embeddings.☆55Updated 3 weeks ago
- A C++ distributed framework for responsive Cloud applications.☆79Updated this week
- A python package for serving LLM on OpenAI-compatible API endpoints with prompt caching using MLX.☆89Updated 2 weeks ago
- High-speed and easy-to-use LLM serving framework for local deployment☆112Updated 4 months ago
- DCPerf benchmark suite for hyperscale cloud applications☆193Updated this week
- ☆80Updated this week
- Exploration of Vector database Index for fast approximate nearest neighbour search.☆28Updated 11 months ago
- LLM Benchmark for Throughput via Ollama (Local LLMs)☆255Updated 2 weeks ago
- Tenstorrent console based hardware information program☆45Updated last week
- 📡 Deploy AI models and apps to Kubernetes without developing a hernia☆32Updated last year
- A tshark MCP server for packet capture and analysis☆18Updated last month