neelsomani/kv-marketplace

neelsomani / kv-marketplace

Cross-GPU KV Cache Marketplace

☆26

Alternatives and similar repositories for kv-marketplace

Users who are interested in kv-marketplace are comparing it to the repositories listed below.

thunlp / NOSA
The official implementation of NOSA
☆19 · Jun 11, 2026 · Updated last month
Venkat2811 / myelon
Ultra-low-latency, high-throughput multiprocess transport over SHM and mmap. LMAX-Disruptor-style cross-process ring substrate.
☆17 · Updated this week
OscarXZQ / delta_activations
Official code release for Delta Activations: A Representation for Finetuned Large Language Models
☆20 · Sep 5, 2025 · Updated 10 months ago
namtranase / terminalmind
Friendly Terminal Assistant for Developers
☆17 · Mar 23, 2024 · Updated 2 years ago
chenjianhuii / Mechanistic-Data-Attribution
☆16 · May 25, 2026 · Updated last month
IBM / activated-lora
Source code for Activated LoRA
☆26 · Updated this week
gigit0000 / qwen3.cu
Single-file, pure CUDA C implementation for running inference on Qwen3 0.6B GGUF. No dependencies.
☆24 · Nov 26, 2025 · Updated 7 months ago
jwkirchenbauer / mtp-lm
Source code to accompany research paper on training multi token prediction language models using self-distillation.
☆39 · Feb 21, 2026 · Updated 5 months ago
Pavankunchala / Reinforcement-learning-with-verifable-rewards-Learnings
RLVR Testing and Training
☆22 · Aug 28, 2025 · Updated 10 months ago
deadshot465 / novelcrafter-mcp
An experimental desktop client for using Claude Desktop's MCP with Novelcrafter codices.
☆11 · Dec 3, 2024 · Updated last year
itsPreto / baby-code
100% Private & Simple. OSS 🐍 Code Interpreter for LLMs 🦙
☆34 · Aug 29, 2023 · Updated 2 years ago
stanford-oval / sliders
Repository for paper: Contexts are Never Long Enough: Structured Reasoning for Scalable Question Answering over Long Document Sets
☆27 · Apr 27, 2026 · Updated 2 months ago
ghadiaravi13 / Untied-Ulysses
☆24 · May 23, 2026 · Updated last month
oliverhu / rama
llama2 inference engine in Rust
☆13 · Apr 12, 2024 · Updated 2 years ago
lacoco-lab / decompiling_transformers
Repo for Paper: Discovering Interpretable Algorithms by Decompiling Transformers to RASP
☆15 · May 25, 2026 · Updated last month
tudasc / PIRA
PIRA - Automatic Instrumentation Refinement
☆17 · Mar 28, 2024 · Updated 2 years ago
BAI-LAB / MoE-CL
[WWW 2026 Oral] MoE-CL: Self-Evolving LLMs via Continual Instruction Tuning
☆21 · Dec 1, 2025 · Updated 7 months ago
TransluceAI / introspective-interp
Repository for "Training Language Models To Explain Their Own Computations"
☆23 · Jul 7, 2026 · Updated 2 weeks ago
keeeeenw / TinyLlama
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
☆14 · Mar 30, 2024 · Updated 2 years ago
syntacore / rvv-simulator
RISC-V vector extension ISA simulation
☆18 · Jun 11, 2019 · Updated 7 years ago
neuraloperator / NNs-to-NOs
☆23 · May 21, 2026 · Updated last month
jacobfa / mot
☆15 · Sep 25, 2025 · Updated 9 months ago
cpldcpu / LRMTokenEconomy
Measuring Thinking Efficiency in Reasoning Models - Research Repository
☆39 · Dec 2, 2025 · Updated 7 months ago
Chaoses-Ib / ConcurrentComputing
☆15 · Jul 13, 2026 · Updated last week
yaof20 / DenseMixer
Official implementation for DenseMixer: Improving MoE Post-Training with Precise Router Gradient
☆67 · Aug 3, 2025 · Updated 11 months ago
curvedinf / novel-writer
Automated LLM novelist
☆46 · Apr 11, 2024 · Updated 2 years ago
mahyar-jahaninasab / Feature-Enforcing-PINN
Enhancing the convergence speed by 2x and improving the training success of Physics-Informed Neural Networks (PINNs).
☆13 · Oct 14, 2024 · Updated last year
Nero10578 / LLM-Inference-Benchmark
☆14 · Aug 25, 2024 · Updated last year
one-covenant / SparseLoCo
CCLoco: Scaling Up Top-K Error Feedback with Local Optimizers
☆26 · Aug 22, 2025 · Updated 10 months ago
SvTPM-impl / SvTPM
vTPM with SGX protection
☆12 · May 30, 2019 · Updated 7 years ago
matatonic / openedai-images
An OpenAI API compatible images server to generate or manipulate images.
☆18 · Feb 2, 2025 · Updated last year
danielemalitesta / Multimodal-DL-4-RecSys
Official GitHub repository of the lecture "Multimodal Deep Learning for Recommendation", at the 2024 ACM RecSys Summer School
☆12 · Oct 12, 2024 · Updated last year
CentML / lorafusion
LoRAFusion: Efficient LoRA Fine-Tuning for LLMs
☆28 · Jul 2, 2026 · Updated 2 weeks ago
Al-aminI / GraphMem
Production-Grade Agent Memory Framework for Agentic AI
☆16 · Apr 15, 2026 · Updated 3 months ago
bashalarmist / hello-ooba
Oobabooga "Hello World" API example for node.js with Express
☆13 · Jul 2, 2023 · Updated 3 years ago
pramod-zillella / Skin-Lesion-Segmentation
Fully automatic skin lesion segmentation using the Berkeley wavelet transform and UNet algorithm.
☆12 · Jun 1, 2021 · Updated 5 years ago
monk1337 / auto-ollama
Run ollama & gguf easily with a single command
☆52 · May 15, 2024 · Updated 2 years ago
rayures / vTPM
libtpms / swtpm software emulation of a Trusted Platform Module (TPM 1.2 and TPM 2.0) compile script
☆13 · Sep 16, 2020 · Updated 5 years ago
Venkat2811 / wombatkv
Object-storage-native KV cache for LLM inference & RL. Cross-restart, cross-conversation, cross-engine via shared S3 bucket.
☆19 · Jul 6, 2026 · Updated 2 weeks ago