Lightning-fast LLM inference engine - Built with Rust (inspiration from https://github.com/GeeeekExplorer/nano-vllm)
☆35Jun 24, 2025Updated 11 months ago
Alternatives and similar repositories for nano-vllm-rs
Users that are interested in nano-vllm-rs are comparing it to the libraries listed below.
- OCaml code from Writing an Interpreter in Go☆11Aug 16, 2019Updated 6 years ago
- [MobiCom 24] Adaptive DNN inference under memory constraints☆57Jan 22, 2025Updated last year
- A C++ thread pool for C++11, C++14, C++17, and C++20☆15Jun 30, 2023Updated 2 years ago
- 🦙🦙.🦀☆28Sep 24, 2023Updated 2 years ago
- An RPC server implementation based on the Raft paper, in Golang☆10Jun 17, 2016Updated 9 years ago
- Compiler for the Tiger programming language☆12Oct 27, 2018Updated 7 years ago
- PKU Compiler Principles lab, Fall 2023, plus documentation for the Koopa IR C++ interface☆16Feb 12, 2024Updated 2 years ago
- ☆36Feb 9, 2024Updated 2 years ago
- MLIR dialect for libgccjit☆24Dec 3, 2024Updated last year
- Llama causal LM fully recreated in LibTorch. Designed to be used in Unreal Engine 5☆16Sep 19, 2024Updated last year
- To better understand the ggml library☆27Jun 13, 2025Updated 11 months ago
- Lab 5 project of MIT-6.5940, deploying LLaMA2-7B-chat on one's laptop with TinyChatEngine.☆18Dec 1, 2023Updated 2 years ago
- Gallatin is a general-purpose memory manager for CUDA that allows for threads to quickly malloc and free memory of arbitrary size inside …☆27May 25, 2026Updated last week
- A set of tools that make working with the Scala ecosystem even better.☆13Apr 4, 2026Updated last month
- An easy-to-use coroutine library implemented with C++ coroutines and liburing☆32Jun 6, 2025Updated 11 months ago
- an attempt at implementing the PBFT algorithm in Go☆12Nov 15, 2016Updated 9 years ago
- Train a tiny LLaMA model from scratch to repeat your words using Reinforcement Learning from Human Feedback (RLHF)☆18May 23, 2024Updated 2 years ago
- Spark and Python study notes☆11Sep 25, 2018Updated 7 years ago
- A simple dictionary translation component☆10Mar 18, 2024Updated 2 years ago
- Modern compiler implementation in ML, in Haskell☆16Apr 4, 2018Updated 8 years ago
- A Flappy Bird HTML5 mini-game written in TypeScript☆28Jan 7, 2023Updated 3 years ago
- A C compiler with SSA-based backend optimization☆15Mar 19, 2016Updated 10 years ago
- Postgres protocol support for finagle☆36Sep 4, 2013Updated 12 years ago
- zero cost Apply/Applicative syntax☆13May 17, 2026Updated 2 weeks ago
- Official Java Client for Ipregistry, a Fast, Reliable IP Geolocation and Threat Data API.☆16Dec 7, 2025Updated 5 months ago
- SmartBuf is a cross-language serialization and deserialization framework, and it has high performance and compression ratio like …☆11Dec 5, 2023Updated 2 years ago
- OpenAI compatible API for open source LLMs☆17Oct 30, 2023Updated 2 years ago
- A C compiler written in Kotlin☆19Jun 22, 2024Updated last year
- Composable data loading for Ruby☆13Apr 8, 2026Updated last month
- The final project of PKU course Compilers: Principles in Spring 2023, a SysY to RISC-V compiler. Document::https://pku-minic.github.io/on…☆21Jun 15, 2023Updated 2 years ago
- Code from Learn You A Haskell book: http://learnyouahaskell.com/☆17Aug 19, 2012Updated 13 years ago
- A toy Go compiler written in Go☆22Dec 18, 2010Updated 15 years ago
- VecoLuc is a scalable vector search engine that leverages Apache Lucene and the JDK's incubator vector API for high-performance vector op…☆11Aug 22, 2024Updated last year
- sh - Super fast Alfred 3+ workflow to search through Chrome history 🕵️♀️☆13Nov 28, 2020Updated 5 years ago
- Wraps RocksDB inside a server that talks like Redis.☆31Feb 20, 2014Updated 12 years ago
- Human-in-the-loop for Dify workflows, via a plugin☆16Jan 7, 2025Updated last year
- Simple Rust binding for the JNI (Java Native Interface) API.☆12May 24, 2026Updated last week
- Special Node-RED node used in Distributed Node-RED☆16May 14, 2019Updated 7 years ago
- A Tiger compiler written in SML.☆13May 7, 2015Updated 11 years ago