ubermenchh/mini-vllm


☆21

Alternatives and similar repositories for mini-vllm

Users who are interested in mini-vllm are comparing it to the repositories listed below.


Hemanthkumar2112 / Reward-Modeling-RLHF-Finetune-and-RAG
Gemma2 (9B) and Llama3-8B fine-tuning and RAG; sample code base, implemented on the Kaggle platform
☆22 · Feb 8, 2025 · Updated last year
tanzelin430 / libsmctrl
A reproduction of the libsmctrl paper, with an added Python interface so that compute resources can be allocated flexibly from Python
☆12 · May 21, 2024 · Updated 2 years ago
TonyTangYu / delta-examples
☆12 · Apr 30, 2024 · Updated 2 years ago
akshitgautam42 / AskYourPDF
Ask questions about your PDF
☆10 · Jun 11, 2023 · Updated 3 years ago
pilancilab / matrix-compressor
Implementation of LPLR algorithm for matrix compression
☆33 · Nov 21, 2023 · Updated 2 years ago
deepneuralmachine / seq2act-tensorflow
Seq2act: Mapping Natural Language Instructions to Mobile UI Action Sequences from Google research
☆15 · Jul 13, 2020 · Updated 6 years ago
nano-o / MultiPaxos
MultiPaxos and Disk Paxos in TLA+ and PlusCal
☆13 · Jan 23, 2023 · Updated 3 years ago
temporal-hpc / reduction-tensor-cores
Fast GPU-based tensor core reductions
☆12 · Jan 13, 2023 · Updated 3 years ago
WujiangXu / MemGym
Code for the paper "MemGym: a Long-Horizon Memory Environment for LLM Agents".
☆18 · Jun 2, 2026 · Updated last month
digoal / gp_tpch
☆13 · Sep 3, 2018 · Updated 7 years ago
rabiulcste / vqazero
Visual question answering prompting recipes for large vision-language models
☆29 · Sep 14, 2024 · Updated last year
VadimSokolov / dl-traffic
Code for the "Deep Learning for Short-Term Traffic Flow Prediction" paper (https://arxiv.org/abs/1604.04527)
☆12 · Apr 12, 2017 · Updated 9 years ago
caiwanxianhust / flash-attention-opt
Flash attention optimization log
☆31 · Jun 4, 2025 · Updated last year
guobbin / PFL-MoE
Federated Learning - PyTorch
☆15 · Jun 27, 2021 · Updated 5 years ago
junminchen / PhyNEO
An advanced physics-driven force field combined with neural network enhancement.
☆19 · Mar 9, 2026 · Updated 4 months ago
gty111 / GEMM_WMMA
GEMM by WMMA (tensor core)
☆15 · Jul 31, 2022 · Updated 3 years ago
stogiannidis / OCaTS
[EMNLP 2023] Source code for Cache me if you Can: an Online Cost-aware Teacher-Student framework to Reduce the Calls to Large Language M…
☆17 · Jun 13, 2025 · Updated last year
XiaoduoAILab / ECom-Bench
☆22 · Sep 29, 2025 · Updated 9 months ago
ictlyh / SourceCodeAnalysis
Source code analysis of Impala, PostgreSQL, Citus and Postgres-XL
☆12 · Jan 16, 2017 · Updated 9 years ago
lyric12345 / The-Hundred-Page-Machine-Learning-Book-by-Andriy-Burkov
Official website of the book: http://themlbook.com/
☆13 · Feb 10, 2019 · Updated 7 years ago
orthecreedence / cl-libevent2
Low-level Common Lisp bindings for Libevent2
☆19 · Jan 5, 2019 · Updated 7 years ago
dongqianwei / presto-localcsv
A Presto plugin that supports reading CSV files from the local filesystem.
☆10 · Jul 27, 2018 · Updated 7 years ago
t54-labs / x402-secure
Risk layer for the x402 protocol.
☆28 · Jul 18, 2026 · Updated last week
0xWelt / VibeRL
VibeRL is a Reinforcement Learning framework built essentially through vibe coding with Kimi K2.
☆17 · Updated this week
nineinchnick / trino-git
A Trino connector to access Git repository contents
☆17 · Feb 9, 2026 · Updated 5 months ago
g588928812 / qlora
QLoRA: Efficient Finetuning of Quantized LLMs
☆11 · Jul 22, 2023 · Updated 3 years ago
UbiquitousLearning / Mandheling-DSP-Training
The open-source project for "Mandheling: Mixed-Precision On-Device DNN Training with DSP Offloading" [MobiCom'2022]
☆20 · Aug 4, 2022 · Updated 3 years ago
TheToughCrane / nano-kvllm
This project aims to provide a highly effective KV cache management framework for LLM inference and improve memory utilization and inference sp…
☆69 · Apr 24, 2026 · Updated 3 months ago
citusdata / membership-manager
Docker image for managing Citus membership via docker-py
☆22 · Aug 12, 2020 · Updated 5 years ago
ValeevGroup / libintx
☆25 · Nov 5, 2025 · Updated 8 months ago
linlu-qiu / lm-inductive-reasoning
☆34 · Nov 21, 2023 · Updated 2 years ago
timescale / timescaledb-wale
Dockerized WAL-E with an HTTP API
☆21 · Nov 5, 2018 · Updated 7 years ago
JoshVarty / SelfSupervisedLearning
Experiments with self-supervised learning
☆11 · Mar 9, 2020 · Updated 6 years ago
bentoml / BentoLMDeploy
Self-host LLMs with LMDeploy and BentoML
☆22 · Jul 14, 2026 · Updated last week
hvanhovell / weld-java
JVM integration for Weld
☆16 · Sep 24, 2018 · Updated 7 years ago
qipengwang / Melon
MobiSys#114
☆23 · Aug 17, 2023 · Updated 2 years ago
romanorac / romanorac.github.io
My blog
☆11 · Nov 17, 2020 · Updated 5 years ago
mrbodoia / alpaca-financial-machine-learning-pipeline
☆10 · Sep 15, 2020 · Updated 5 years ago
zhliu0106 / learning-to-refuse
Official Implementation of "Learning to Refuse: Towards Mitigating Privacy Risks in LLMs"
☆10 · Dec 13, 2024 · Updated last year