Azure / The-LLM-Latency-Guidebook-Optimizing-Response-Times-for-GenAI-Applications
Many articles cover the principles of latency optimization for LLMs, but it is often unclear how to implement these principles in practice. This repository provides practical techniques for reducing the latency of GenAI applications.
☆34May 6, 2024Updated last year
Alternatives and similar repositories for The-LLM-Latency-Guidebook-Optimizing-Response-Times-for-GenAI-Applications
Users who are interested in The-LLM-Latency-Guidebook-Optimizing-Response-Times-for-GenAI-Applications are comparing it to the repositories listed below
- Convert any image into a Region Adjacency Graph (RAG)☆12Apr 27, 2020Updated 5 years ago
- AI-Sentry: A lightweight, pluggable facade layer for Azure Open AI, addressing common cross-cutting concerns for enterprise-wide scaling.☆17Aug 4, 2025Updated 6 months ago
- Tui Utility to test REST APIs☆13Nov 20, 2023Updated 2 years ago
- Library to convert natural language utterance into a structured domain specific language☆19Jan 13, 2026Updated last month
- Solution Accelerator: Using Logic Apps & Form Recognizer☆15Sep 22, 2023Updated 2 years ago
- This solution converts speech to text and then processes and summarizes the text based on the prompt scenario.☆19Aug 8, 2024Updated last year
- ☆15Mar 6, 2024Updated last year
- Creates an Azure AI Studio hub, project and required dependent resources including Azure Open AI Service, Cognitive Search and more.☆31Oct 2, 2024Updated last year
- Assistant API to chat with tabular data and perform analytics in natural language.☆56Aug 30, 2024Updated last year
- Official repo for the NCR Crypto Meetup☆17Jun 1, 2022Updated 3 years ago
- ☆30Feb 14, 2025Updated last year
- Contoso Outdoors Company web application shown at Microsoft Ignite☆58May 10, 2024Updated last year
- ☆15Aug 30, 2021Updated 4 years ago
- This Python project integrates MetaTrader5 with GPT-4 to generate automated trading signals. It analyzes OHLC and tick data to provide re…☆12Aug 25, 2024Updated last year
- 🤖 UI for gpt-all-star: https://github.com/kyaukyuai/gpt-all-star☆28Feb 5, 2026Updated last week
- Course materials for Data Structures with Python (AIX20001)☆18Jun 14, 2024Updated last year
- ☆30Apr 8, 2022Updated 3 years ago
- The AI Video Intelligence Solution Accelerator enables developers to deploy an end-to-end IoT Edge, including Azure Data Box Edge, based …☆28Oct 4, 2023Updated 2 years ago
- Create an MCP Server for your API using the TypeSpec MCP Server☆44Feb 4, 2026Updated last week
- ☆30Aug 2, 2024Updated last year
- ☆14Dec 7, 2025Updated 2 months ago
- A Terminal User Interface (TUI) application that enables interactive conversations with your documents using Large Language Models (LLM) …☆13Dec 11, 2024Updated last year
- A simple Sentiment Analysis API in FastAPI.☆15Dec 17, 2024Updated last year
- An experiment to see if we can process G2 reviews to extract topics from reviews☆10Feb 5, 2024Updated 2 years ago
- This solution converts speech to text and then processes and summarizes the text based on the prompt scenario.☆39Oct 8, 2024Updated last year
- Generative AI Ops RAG project template☆38Mar 11, 2025Updated 11 months ago
- AI Coworker that lives in slack☆36Jun 7, 2024Updated last year
- Your personal free AI RA. Demo⬇️☆34Sep 13, 2023Updated 2 years ago
- ☆36Nov 15, 2024Updated last year
- Code for MS Ignite 2023 | Live Breakout Session | BRK203☆32Dec 5, 2023Updated 2 years ago
- Interact with ChatGPT and GPT-4 in alternative ways☆13Mar 17, 2024Updated last year
- ☆12Jul 5, 2020Updated 5 years ago
- This sample shows how to implement a deep researcher with DeepSeek R1☆64May 20, 2025Updated 8 months ago
- ☆16Jan 23, 2026Updated 3 weeks ago
- This project aims to build a traveling recommendation application using Google Places API and OpenAI LLM.☆11Mar 19, 2024Updated last year
- Validate best practices in your project using this tool☆12May 9, 2019Updated 6 years ago
- My talks!☆13Oct 13, 2025Updated 4 months ago
- 🔒 Reference MCP servers that demo how authentication works with the current Model Context Protocol spec.☆47Jan 7, 2026Updated last month
- Configure Internal iOS Settings, like SpringBoard, Carrier Settings, Mobile Asset Settings.☆10Mar 6, 2019Updated 6 years ago