Many articles cover the principles of latency optimization for LLMs, but it is often unclear how to actually implement them. This repository provides practical techniques for reducing the latency of GenAI applications.
☆34 · May 6, 2024 · Updated last year
Alternatives and similar repositories for The-LLM-Latency-Guidebook-Optimizing-Response-Times-for-GenAI-Applications
Users who are interested in The-LLM-Latency-Guidebook-Optimizing-Response-Times-for-GenAI-Applications are comparing it to the repositories listed below.
- TUI utility to test REST APIs ☆13 · Nov 20, 2023 · Updated 2 years ago
- AI-Sentry: A lightweight, pluggable facade layer for Azure Open AI, addressing common cross-cutting concerns for enterprise-wide scaling. ☆17 · Aug 4, 2025 · Updated 7 months ago
- Convert any image into a Region Adjacency Graph (RAG) ☆12 · Apr 27, 2020 · Updated 5 years ago
- Library to convert natural language utterance into a structured domain specific language ☆18 · Feb 11, 2026 · Updated 3 weeks ago
- This solution converts speech to text and then processes and summarizes the text based on the prompt scenario. ☆19 · Aug 8, 2024 · Updated last year
- Solution Accelerator: Using Logic Apps & Form Recognizer ☆15 · Sep 22, 2023 · Updated 2 years ago
- ☆15 · Mar 6, 2024 · Updated 2 years ago
- Creates an Azure AI Studio hub, project and required dependent resources including Azure Open AI Service, Cognitive Search and more. ☆32 · Oct 2, 2024 · Updated last year
- Assistant API to chat with tabular data and perform analytics in natural language. ☆56 · Aug 30, 2024 · Updated last year
- Official repo for the NCR Crypto Meetup ☆17 · Jun 1, 2022 · Updated 3 years ago
- ☆30 · Feb 14, 2025 · Updated last year
- 🤖 UI for gpt-all-star: https://github.com/kyaukyuai/gpt-all-star ☆28 · Feb 12, 2026 · Updated 3 weeks ago
- Contoso Outdoors Company web application shown at Microsoft Ignite ☆58 · May 10, 2024 · Updated last year
- This Python project integrates MetaTrader5 with GPT-4 to generate automated trading signals. It analyzes OHLC and tick data to provide re… ☆12 · Aug 25, 2024 · Updated last year
- Data Structures with Python (AIX20001) lecture materials ☆18 · Jun 14, 2024 · Updated last year
- ☆15 · Aug 30, 2021 · Updated 4 years ago
- ☆30 · Apr 8, 2022 · Updated 3 years ago
- The AI Video Intelligence Solution Accelerator enables developers to deploy an end-to-end IoT Edge, including Azure Data Box Edge, based … ☆27 · Oct 4, 2023 · Updated 2 years ago
- Create an MCP Server for your API using the TypeSpec MCP Server ☆45 · Feb 4, 2026 · Updated last month
- ☆14 · Updated this week
- An experiment to see if we can process G2 reviews to extract topics from reviews ☆10 · Feb 5, 2024 · Updated 2 years ago
- ☆30 · Aug 2, 2024 · Updated last year
- A simple Sentiment Analysis API in FastAPI. ☆14 · Dec 17, 2024 · Updated last year
- A Terminal User Interface (TUI) application that enables interactive conversations with your documents using Large Language Models (LLM) … ☆13 · Dec 11, 2024 · Updated last year
- This solution converts speech to text and then processes and summarizes the text based on the prompt scenario. ☆39 · Oct 8, 2024 · Updated last year
- AI Coworker that lives in Slack ☆36 · Jun 7, 2024 · Updated last year
- Your personal free AI RA. Demo⬇️ ☆34 · Sep 13, 2023 · Updated 2 years ago
- ☆36 · Nov 15, 2024 · Updated last year
- Code for MS Ignite 2023 | Live Breakout Session | BRK203 ☆32 · Dec 5, 2023 · Updated 2 years ago
- Generative AI Ops RAG project template ☆40 · Mar 11, 2025 · Updated 11 months ago
- ☆12 · Jul 5, 2020 · Updated 5 years ago
- ☆11 · Sep 1, 2025 · Updated 6 months ago
- ☆17 · Jan 23, 2026 · Updated last month
- A full-stack AI-powered business intelligence tool for non-experts, featuring serverless backend processing and a secure Streamlit fronte… ☆28 · Feb 13, 2026 · Updated 3 weeks ago
- Learn how to observe, manage, and scale agentic AI apps using Azure AI Foundry with this hands-on workshop ☆39 · Feb 5, 2026 · Updated last month
- 🔒 Reference MCP servers that demo how authentication works with the current Model Context Protocol spec. ☆51 · Jan 7, 2026 · Updated last month
- Validate best practices in your project using this tool ☆12 · May 9, 2019 · Updated 6 years ago
- Configure Internal iOS Settings, like SpringBoard, Carrier Settings, Mobile Asset Settings. ☆10 · Mar 6, 2019 · Updated 7 years ago
- Application of Blockchain in Crop Farming and Crop Supply ☆10 · May 15, 2018 · Updated 7 years ago