Azure / The-LLM-Latency-Guidebook-Optimizing-Response-Times-for-GenAI-Applications

Many articles cover the principles of latency optimization for LLMs, but it is often unclear how to implement those principles in practice. This repository provides practical techniques for reducing the latency of GenAI applications.

☆27

Alternatives and similar repositories for The-LLM-Latency-Guidebook-Optimizing-Response-Times-for-GenAI-Applications:

Users who are interested in The-LLM-Latency-Guidebook-Optimizing-Response-Times-for-GenAI-Applications are comparing it to the repositories listed below.

microsoft / intelligence-toolkit
Interactive workflows for creating AI intelligence reports from real-world data sources
☆73 Updated last month
monuminu / AOAI_Samples
☆92 Updated last month
microsoft / rag-e2e-sample
☆28 Updated 11 months ago
Azure-Samples / llama-index-python
This sample shows how to quickly get started with LlamaIndex.ai on Azure🚀
☆56 Updated last month
microsoft / promptflow-rag-project-template
An end-to-end sample of RAG showcasing development, evaluation, experimentation, and deployment using Promptflow, search products like Co…
☆51 Updated 7 months ago
farzad528 / azure-ai-search-python-playground
Some python code samples using Azure AI Search for Generative AI stuff
☆57 Updated 3 months ago
microsoft / multi-modal-customer-service-agent
Multi-modal & multi-domain customer service agent with real time text, voice and soon video
☆53 Updated last week
Azure-Samples / langfuse-on-azure
An easy way to deploy the Langfuse observability platform to Azure Container Apps with Entra authentication.
☆50 Updated 4 months ago
microsoft / synthetic-rag-index
Service to import data from various sources and index it in AI Search. Increases data relevance and reduces final size by 90%+. Useful fo…
☆28 Updated 6 months ago
Azure-Samples / multimodal-rag-code-execution
A multimodal Retrieval Augmented Generation with code execution capabilities. Process multiple complex documents with images, table, char…
☆53 Updated last month
Azure / gpt-rag-orchestrator
☆57 Updated 2 months ago
microsoft / llmops-workshop
Learn how to build solutions with Large Language Models.
☆146 Updated 7 months ago
Azure / gpt-rag-ingestion
☆103 Updated 2 weeks ago
Azure-Samples / dream-team
This repo helps you to build a team of AI agents with Autogen
☆178 Updated last week
microsoft / AgenticCookBook
The “Agentic Cookbook for Generative AI Agent usage” is a comprehensive guide designed to empower users with the knowledge and tools to e…
☆107 Updated last month
microsoft / Multi-Agent-Custom-Automation-Engine-Solution-Accelerator
The Multi-Agent Custom Automation Engine Solution Accelerator is an AI-driven orchestration system that manages a group of AI agents to a…
☆181 Updated this week
microsoft / azure-streamlit-chatbot
Example for Deploying Chatbot using Streamlit and Azure Web App
☆48 Updated last year
microsoft / genaiops-promptflow-template
GenAIOps with Prompt Flow is a "GenAIOps template and guidance" to help you build LLM-infused apps using Prompt Flow. It offers a range o…
☆317 Updated 2 weeks ago
microsoft / rag-experiment-accelerator
The RAG Experiment Accelerator is a versatile tool designed to expedite and facilitate the process of conducting experiments and evaluati…
☆244 Updated 2 weeks ago
Azure / gpt-rag-agentic
☆79 Updated 2 weeks ago
Azure-Samples / azureai-assistant-tool
The Azure AI Assistant Tool is an experimental Python application and middleware designed to simplify the development, experimentation, test…
☆136 Updated last month
victordibia / multiagent-systems-with-autogen
Building LLM-Enabled Multi Agent Applications with AutoGen
☆118 Updated 2 weeks ago
Azure-Samples / graphrag-legalcases-postgres
Legal Research Copilot Example Solution built with Generative AI capabilities of PostgreSQL on Azure
☆69 Updated last month
farzad528 / mcp-server-azure-ai-agents
Model Context Protocol Servers for Azure AI Search
☆37 Updated last month
Azure-Samples / raft-distillation-recipe
A recipe that will walk you through using either Meta Llama 3.1 405B or GPT-4o deployed on Azure AI to generate a synthetic dataset using…
☆59 Updated 2 months ago
Anaig / OpenAI-and-Cognitive-Search
Azure OpenAI integration as a custom skillset in Azure Cognitive Search
☆33 Updated 2 years ago
olimoz / AI_Teams_AutoGen
AutoGen data analysis
☆32 Updated last year
timoklimmer / powerproxy-aoai
Monitors and processes traffic to and from Azure OpenAI endpoints.
☆105 Updated this week
Azure / gpt-rag-frontend
☆53 Updated last month
msamylea / autogen_focus_group
Virtual focus group with custom personas, product details, and final analysis created with AutoGen, Ollama/Llama3, and Streamlit.
☆45 Updated 9 months ago