Self-host LLMs with LMDeploy and BentoML
☆22Dec 26, 2025Updated 2 months ago
Alternatives and similar repositories for BentoLMDeploy
Users who are interested in BentoLMDeploy compare it to the libraries listed below
- ☆20Jun 9, 2025Updated 9 months ago
- Stateful LLM Serving☆97Mar 11, 2025Updated last year
- An LLM leaderboard for stateful agents☆21Oct 16, 2025Updated 5 months ago
- ☆15Apr 26, 2025Updated 10 months ago
- Experiments Notebook of "Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism"☆15Apr 30, 2025Updated 10 months ago
- Paper-reading notes for Berkeley OS prelim exam.☆14Aug 28, 2024Updated last year
- ☆16Feb 22, 2025Updated last year
- Prompt templates for language models☆10Feb 28, 2026Updated 3 weeks ago
- Langchain + Docker + Neo4j☆10Oct 29, 2024Updated last year
- Text Classification Dataset for Turkish Language☆10Nov 16, 2021Updated 4 years ago
- Through this project we have comprehensively evaluated 10 workload predictors and determined which predictor works the best for Alibaba C…☆12Dec 5, 2018Updated 7 years ago
- ☆13May 12, 2025Updated 10 months ago
- A curated list for Efficient Large Language Models☆11Mar 25, 2024Updated last year
- Elevate your language models with insightful diversity metrics.☆11Feb 4, 2024Updated 2 years ago
- Distributed IO-aware Attention algorithm☆24Sep 24, 2025Updated 5 months ago
- Final training script from HuggingFace Whisper Fine tuning event - to get best results on finetuned model.☆12Dec 24, 2022Updated 3 years ago
- Interpreting Learned Search and Planning: Reverse-engineering recurrent convolutional networks (DRC) that play Sokoban☆17Jun 29, 2025Updated 8 months ago
- A throughput-oriented high-performance serving framework for LLMs☆949Oct 29, 2025Updated 4 months ago
- The driver for LMCache core to run in vLLM☆63Feb 4, 2025Updated last year
- An Educational Framework Based on PyTorch for Deep Learning Education and Exploration☆11Dec 24, 2023Updated 2 years ago
- Simple Telegram bot to annotate and verify automatic speech recognition datasets☆12Mar 30, 2021Updated 4 years ago
- A RAG that can scale 🧑🏻💻☆11May 28, 2024Updated last year
- SGLang is a fast serving framework for large language models and vision language models.☆33Nov 24, 2025Updated 3 months ago
- ☆11May 18, 2025Updated 10 months ago
- EuroSys '24: "Trinity: A Fast Compressed Multi-attribute Data Store"☆19Mar 8, 2025Updated last year
- Rethinking the Trust Region in LLM Reinforcement Learning☆45Mar 2, 2026Updated 2 weeks ago
- ☆91Oct 30, 2025Updated 4 months ago
- Details of the datasets for Few-shot class-incremental audio classification☆11Dec 6, 2023Updated 2 years ago
- Resources related to EMNLP 2021 paper "FAME: Feature-Based Adversarial Meta-Embeddings for Robust Input Representations"☆13Dec 14, 2021Updated 4 years ago
- 3D visualization of depth maps created by any depth model, such as Monodepth, Packnet, etc.☆12Jun 9, 2020Updated 5 years ago
- Source code for ACL 2020 paper "Learning Spoken Language Representations with Neural Lattice Language Modeling"☆17Feb 11, 2023Updated 3 years ago
- Marketplace ML experiment - training without backprop☆27Sep 9, 2025Updated 6 months ago
- An automated data pipeline scaling RL to pretraining levels☆74Oct 11, 2025Updated 5 months ago
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction☆87Mar 23, 2025Updated 11 months ago
- Implementation of SmoothCache, a project aimed at speeding-up Diffusion Transformer (DiT) based GenAI models with error-guided caching.☆48Jul 17, 2025Updated 8 months ago
- [APSIPA'22] Exploring Speaker Age Estimation on Different Self-Supervised Learning Models☆14Oct 19, 2022Updated 3 years ago
- some stuff about generative ai☆15Feb 20, 2025Updated last year
- Nex Venus Communication Library☆74Nov 17, 2025Updated 4 months ago
- A desktop compatible version of the Defog app☆14Aug 20, 2024Updated last year