Self-host LLMs with LMDeploy and BentoML
☆22 · Dec 26, 2025 · Updated 4 months ago
Alternatives and similar repositories for BentoLMDeploy
Users who are interested in BentoLMDeploy are comparing it to the libraries listed below.
- ☆21 · Jun 9, 2025 · Updated 10 months ago
- Stateful LLM Serving ☆99 · Mar 11, 2025 · Updated last year
- ⚡ Harry Potter books and audiobooks ☆12 · Oct 1, 2020 · Updated 5 years ago
- Experiments Notebook of "Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism" ☆15 · Apr 30, 2025 · Updated last year
- Paper-reading notes for Berkeley OS prelim exam. ☆14 · Aug 28, 2024 · Updated last year
- ☆16 · Feb 22, 2025 · Updated last year
- Prompt templates for language models ☆10 · Apr 7, 2026 · Updated 3 weeks ago
- [EMNLP'25 Industry] Repo for "Z1: Efficient Test-time Scaling with Code" ☆68 · Apr 11, 2025 · Updated last year
- Text Classification Dataset for Turkish Language ☆10 · Nov 16, 2021 · Updated 4 years ago
- A Presto plugin that supports reading CSV files from the local filesystem. ☆10 · Jul 27, 2018 · Updated 7 years ago
- Simple image compression/decompression algorithm using DWT (discrete wavelet transform) and RLE+Huffman encoding. ☆11 · Nov 3, 2013 · Updated 12 years ago
- Through this project we have comprehensively evaluated 10 workload predictors and determined which predictor works the best for Alibaba C… ☆12 · Dec 5, 2018 · Updated 7 years ago
- Official website of the book: http://themlbook.com/ ☆13 · Feb 10, 2019 · Updated 7 years ago
- How to build a sentence embedding application using BentoML ☆15 · Mar 31, 2025 · Updated last year
- Source code for ASRU 2019 paper "Adapting Pretrained Transformer to Lattices for Spoken Language Understanding" ☆10 · Jul 8, 2020 · Updated 5 years ago
- API serving for your diffusers models ☆11 · Jan 19, 2024 · Updated 2 years ago
- Zero-shot NER fine-tuning ☆14 · Mar 17, 2025 · Updated last year
- Final training script from the HuggingFace Whisper fine-tuning event, to get the best results on a fine-tuned model. ☆12 · Dec 24, 2022 · Updated 3 years ago
- Sentence Embedding as a Service ☆15 · Jun 30, 2025 · Updated 10 months ago
- ☆24 · Updated this week
- Demonstrate Function Calling code portability across 4 AI Models: OpenAI, AzureOpenAI, VertexAI Gemini and Mistral AI. ☆13 · Jun 7, 2024 · Updated last year
- Interface Design for Self-Supervised Speech Models, Accepted to Interspeech2024 ☆16 · Nov 19, 2024 · Updated last year
- Docker image for managing Citus membership via docker-py ☆22 · Aug 12, 2020 · Updated 5 years ago
- Ruby on Rails template engine that allows for multiple formats being laid out in a single specification. ☆13 · Jan 28, 2013 · Updated 13 years ago
- An Educational Framework Based on PyTorch for Deep Learning Education and Exploration ☆11 · Dec 24, 2023 · Updated 2 years ago
- ☆11 · May 18, 2025 · Updated 11 months ago
- Agent Skills for Mojo and MAX development ☆79 · Apr 17, 2026 · Updated 2 weeks ago
- ☆11 · Mar 4, 2026 · Updated last month
- SV-Sim: Scalable PGAS-based State Vector Simulation of Quantum Circuits ☆22 · Feb 2, 2024 · Updated 2 years ago
- EuroSys '24: "Trinity: A Fast Compressed Multi-attribute Data Store" ☆18 · Mar 8, 2025 · Updated last year
- A small utility module to make it simple to build BentoML Services into images inside Kubernetes clusters. ☆10 · Dec 15, 2020 · Updated 5 years ago
- Fast model deployment on AWS EC2 ☆14 · Feb 25, 2024 · Updated 2 years ago
- Interpreting Learned Search and Planning: Reverse-engineering recurrent convolutional networks (DRC) that play Sokoban ☆19 · Jun 29, 2025 · Updated 10 months ago
- A fast parallel implementation of RNN Transducer. ☆12 · Apr 8, 2025 · Updated last year
- Resources related to EMNLP 2021 paper "FAME: Feature-Based Adversarial Meta-Embeddings for Robust Input Representations" ☆13 · Dec 14, 2021 · Updated 4 years ago
- Simplified recipes for preparing commonly used speech datasets, and a PyTorch-compatible Python data loader that can perform standard fea… ☆16 · Jun 12, 2023 · Updated 2 years ago
- Source code for ACL 2020 paper "Learning Spoken Language Representations with Neural Lattice Language Modeling" ☆17 · Feb 11, 2023 · Updated 3 years ago
- ☆13 · Jul 5, 2023 · Updated 2 years ago
- Marketplace ML experiment - training without backprop ☆27 · Sep 9, 2025 · Updated 7 months ago