There are many articles that cover the principles of latency optimization for LLMs; however, it is often unclear how to actually implement these principles. This repository provides practical techniques for reducing the latency of GenAI applications.
☆37 · May 6, 2024 · Updated 2 years ago
Alternatives and similar repositories for The-LLM-Latency-Guidebook-Optimizing-Response-Times-for-GenAI-Applications
Users that are interested in The-LLM-Latency-Guidebook-Optimizing-Response-Times-for-GenAI-Applications are comparing it to the libraries listed below.
- ☆14 · Apr 1, 2025 · Updated last year
- Generative AI Ops RAG project template ☆43 · Apr 21, 2026 · Updated last month
- State‑of‑the‑art speech recognition model for English, delivering transcription accuracy across diverse audio scenarios. <metadata> gpu: … ☆21 · Apr 16, 2025 · Updated last year
- hikalium's lifestyle guide ☆13 · Feb 16, 2025 · Updated last year
- This solution converts speech to text and then processes and summarizes the text based on the prompt scenario. ☆20 · Aug 8, 2024 · Updated last year
- A Durable Task Python SDK compatible with the Durable Task Scheduler ☆30 · May 15, 2026 · Updated last week
- Solution Accelerator: Using Logic Apps & Form Recognizer ☆15 · Sep 22, 2023 · Updated 2 years ago
- A collection of Korean NLP hands-on labs on Amazon SageMaker ☆19 · Dec 20, 2023 · Updated 2 years ago
- Creates an Azure AI Studio hub, project and required dependent resources including Azure Open AI Service, Cognitive Search and more. ☆33 · Oct 2, 2024 · Updated last year
- Coffee Chat Voice Assistant is a voice-driven ordering system powered by Azure OpenAI GPT-4o Realtime API, simulating the experience of o… ☆31 · May 4, 2026 · Updated 3 weeks ago
- Assistant API to chat with tabular data and perform analytics in natural language. ☆56 · Aug 30, 2024 · Updated last year
- ☆10 · Dec 27, 2024 · Updated last year
- Hex Editor Neo Structure Definition File Library ☆12 · Jul 4, 2025 · Updated 10 months ago
- Azure OpenAI benchmarking tool ☆29 · Apr 4, 2025 · Updated last year
- Datasets and models included in the book "Introduction to Bayesian Data Analysis for Cognitive Science". ☆17 · Apr 21, 2026 · Updated last month
- ☆12 · Dec 19, 2023 · Updated 2 years ago
- Explore the use of DSPy for extracting features from PDFs 🔎 ☆52 · Mar 1, 2024 · Updated 2 years ago
- Convert any image into a Region Adjacency Graph (RAG) ☆12 · Apr 27, 2020 · Updated 6 years ago
- ☆15 · Apr 8, 2026 · Updated last month
- Library to convert natural language utterance into a structured domain specific language ☆19 · Feb 11, 2026 · Updated 3 months ago
- ☆15 · Oct 18, 2024 · Updated last year
- Machine Learning Projects for the Dr. Andrew Ng's course on Coursera ☆12 · Dec 19, 2017 · Updated 8 years ago
- 🚀 Embark on your agentic journey ! ☆29 · May 28, 2025 · Updated 11 months ago
- This hands-on lab walks you through a step-by-step approach to efficiently serving and fine-tuning large-scale Korean models on AWS infra… ☆26 · Feb 8, 2024 · Updated 2 years ago
- A package that can be locally executed to generate minutes in Japanese ☆10 · Sep 11, 2023 · Updated 2 years ago
- ☆15 · Mar 6, 2024 · Updated 2 years ago
- Templates etc. for creating experiments using Ibex Farm. ☆11 · Jul 21, 2018 · Updated 7 years ago
- The Atlas of Pidgin and Creole Language Structures ☆16 · Nov 9, 2022 · Updated 3 years ago
- ☆11 · Nov 20, 2020 · Updated 5 years ago
- Lab manual for Psyc 3400 @ Brooklyn College ☆17 · Dec 10, 2020 · Updated 5 years ago
- a Pandoc template for a beamer poster, ideally to be used with RStudio's knitr/pandoc/latex build chain. ☆14 · Nov 25, 2016 · Updated 9 years ago
- ☆30 · Feb 14, 2025 · Updated last year
- Interlinear glosses for pandoc ☆10 · Feb 12, 2018 · Updated 8 years ago
- AI-Sentry: A lightweight, pluggable facade layer for Azure Open AI, addressing common cross-cutting concerns for enterprise-wide scaling. ☆17 · Aug 4, 2025 · Updated 9 months ago
- Repo for "An empirically-driven guide on using Bayes Factors for M/EEG decoding" ☆13 · Nov 2, 2022 · Updated 3 years ago
- Official repo for the NCR Crypto Meetup ☆17 · Jun 1, 2022 · Updated 3 years ago
- This solution converts speech to text and then processes and summarizes the text based on the prompt scenario. ☆39 · Oct 8, 2024 · Updated last year
- High-performance open-source orchestration utility that utilizes EBS Direct APIs to efficiently clone, copy and migrate EBS snapshots to … ☆39 · Dec 11, 2024 · Updated last year
- ☆15 · Aug 30, 2021 · Updated 4 years ago