Learn the ins and outs of efficiently serving Large Language Models (LLMs). Dive into optimization techniques, including KV caching and Low Rank Adapters (LoRA), and gain hands-on experience with Predibase’s LoRAX framework inference server.
☆19Apr 12, 2024Updated last year
Alternatives and similar repositories for Efficiently-Serving-LLMs
Users who are interested in Efficiently-Serving-LLMs are comparing it to the libraries listed below.
- Human Evaluation Benchmark for Text Simplification☆10Sep 6, 2018Updated 7 years ago
- A monolingual parallel corpus for sentence simplification☆11Jul 4, 2016Updated 9 years ago
- This is the code for neural-Jacana aligner, and the data for MultiMWA dataset.☆20Feb 12, 2023Updated 3 years ago
- A set of tips and tricks to assist in the Certified Kubernetes Application Developer exam by Cloud Native Computing Foundation.☆93Dec 20, 2022Updated 3 years ago
- A Parallel Russian-Simple Russian Dataset☆15Mar 30, 2023Updated 2 years ago
- Deploy SageMaker models with Terraform☆23Feb 14, 2018Updated 8 years ago
- Deploy, launch and use LLMs on AWS☆16Jun 2, 2023Updated 2 years ago
- Tutorial for LLM developers about engine design, service deployment, evaluation/benchmark, etc. Provide a C/S style optimized LLM inferen…☆19Sep 5, 2023Updated 2 years ago
- ☆10Aug 18, 2021Updated 4 years ago
- A catalog of links to useful language resources for the Bashkir language☆16Jul 25, 2024Updated last year
- ☆25May 9, 2022Updated 3 years ago
- Concurrent inverse Bloom filter.☆15Feb 3, 2015Updated 11 years ago
- Alignment and annotation for comparable documents.☆22Oct 16, 2018Updated 7 years ago
- ☆10Aug 24, 2023Updated 2 years ago
- Code for AINL2018 paper Deep Convolutional Networks for Supervised Morpheme Segmentation of Russian Language☆24Aug 23, 2019Updated 6 years ago
- Machine Learning for Mathematics Faculty (HSE) 2018☆18Jan 23, 2022Updated 4 years ago
- Repo for the Advanced Python Skills course that I created (hosted in Udemy and Skillshare)☆15Nov 1, 2020Updated 5 years ago
- oneNeuron PyTorch basics course docs plus code☆10Mar 14, 2022Updated 4 years ago
- COMET for African languages☆11Jan 24, 2025Updated last year
- ☆27Aug 20, 2018Updated 7 years ago
- ☆14Oct 11, 2023Updated 2 years ago
- Ideas on how to quickly learn to build command-line tools☆11Feb 26, 2022Updated 4 years ago
- A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning☆16Jul 20, 2020Updated 5 years ago
- LLM Evals Leaderboard☆48Nov 21, 2023Updated 2 years ago
- Super Mario is a legendary game we all cherish! In this project, we will deploy Super Mario on Amazon EKS (Elastic Kubernetes Service) us…☆11Feb 3, 2026Updated last month
- A Python package that helps you create paths relative to the project root☆15Dec 27, 2022Updated 3 years ago
- Repository for React Fundamentals classroom demonstration contacts app☆11Nov 19, 2024Updated last year
- Community detection in complex networks using hybrid quantum annealing on Amazon Braket☆13Jul 6, 2023Updated 2 years ago
- A SSD-based graph processing engine for billion-node graphs☆12Feb 1, 2015Updated 11 years ago
- A Java library for quantum programming using Quil.☆16Jul 23, 2018Updated 7 years ago
- This repository contains publicly available speech and text data in Luganda.☆12Sep 4, 2020Updated 5 years ago
- For my IBM Data Science Professional certificate capstone project in early 2020, I used pandas, the FourSquare API, Folium, and other Pyt…☆13Dec 31, 2020Updated 5 years ago
- You’ll explore new advancements like ChatGPT’s function calling capability, and build a conversational agent using a new syntax called La…☆17Oct 28, 2023Updated 2 years ago
- A compute framework for building Search, RAG, Recommendations and Analytics over complex (structured+unstructured) data, with ultra-modal…☆12Sep 16, 2024Updated last year
- This repository will take you through creating a FastAPI StableDiffusion app (including Dockerfile) all the way to adding a new feature u…☆38Nov 9, 2022Updated 3 years ago
- Russian morphological tagset converters library.☆43Oct 4, 2019Updated 6 years ago
- AI-driven Vehicle Simulation using Machine Learning (CNN) | PyTorch implementation of "End to End Learning for Self-Driving Cars" (arXiv:…☆21Jan 18, 2020Updated 6 years ago
- Packed Memory Array☆17May 14, 2014Updated 11 years ago
- The specification of the LDBC Financial Benchmark☆19Jan 9, 2026Updated 2 months ago