ubermenchh / mini-vllm
☆16Updated last month
Alternatives and similar repositories for mini-vllm
Users who are interested in mini-vllm are comparing it to the libraries listed below.
- Implements an LLM similar to Meta's Llama 2 from the ground up in PyTorch, for educational purposes.☆38Updated last year
- Some microbenchmarks and design docs before commencement☆12Updated 5 years ago
- Benchmark for machine learning model online serving (LLM, embedding, Stable-Diffusion, Whisper)☆28Updated 2 years ago
- Gemma2(9B), Llama3-8B-Finetune-and-RAG, code base for sample, implemented in Kaggle platform☆22Updated last year
- A collection of reproducible inference engine benchmarks☆38Updated 9 months ago
- Manages vllm-nccl dependency☆17Updated last year
- ☆45Updated 9 months ago
- Trace LLM calls (and others) and visualize them in WandB, as interactive SVG or using a streaming local webapp☆14Updated 11 months ago
- Building Recommender System with the Two-Tower Architecture☆17Updated 4 years ago
- Live evaluation of trading agents☆95Updated 2 months ago
- A collection of lightweight interpretability scripts to understand how LLMs think☆89Updated 2 weeks ago
- ☆63Updated last year
- Fast reinforcement learning 💨☆28Updated 6 months ago
- Official implementation for paper "How Far Are We from Genuinely Useful Deep Research Agents?"☆63Updated 2 months ago
- Verifiers for LLM Reinforcement Learning☆80Updated 9 months ago
- Implemented a script that automatically adjusts Qwen3's inference and non-inference capabilities, based on an OpenAI-like API. The infere…☆22Updated 9 months ago
- Simple repository for training small reasoning models☆49Updated last year
- [SIGIR 2024 (Demo)] CoSearchAgent: A Lightweight Collaborative Search Agent with Large Language Models☆30Updated last year
- ☆47Updated 9 months ago
- Implementation of an LLM prompting pipeline combined with wrappers for auto-decomposing reasoning steps and for search through the reason…☆15Updated last year
- Code repo for MathAgent☆19Updated 2 years ago
- Official Project Page for HLA: Higher-order Linear Attention (https://arxiv.org/abs/2510.27258)☆44Updated last month
- Benchmarking Optimizers for LLM Pretraining☆49Updated last month
- Supervised instruction finetuning for LLM with HF trainer and Deepspeed☆36Updated 2 years ago
- RLHF experiments on a single A100 40G GPU. Support PPO, GRPO, REINFORCE, RAFT, RLOO, ReMax, DeepSeek R1-Zero reproducing.☆79Updated 11 months ago
- A curated list of the role of small models in the LLM era☆111Updated last year
- Learning PyTorch through the D2L book. A series of notebooks for the same☆28Updated 3 years ago
- ☆96Updated 2 weeks ago
- ☆42Updated 9 months ago
- [ICLR 2025] DSBench: How Far are Data Science Agents from Becoming Data Science Experts?☆103Updated 5 months ago