Azure / The-LLM-Latency-Guidebook-Optimizing-Response-Times-for-GenAI-Applications
Many articles cover the principles of latency optimization for LLMs, but it is often unclear how to implement those principles in practice. This repository provides practical techniques for reducing the latency of GenAI applications.
☆23 Updated 9 months ago
Alternatives and similar repositories for The-LLM-Latency-Guidebook-Optimizing-Response-Times-for-GenAI-Applications:
Users who are interested in The-LLM-Latency-Guidebook-Optimizing-Response-Times-for-GenAI-Applications are comparing it to the repositories listed below.
- ☆54 Updated 3 weeks ago
- This solution converts speech to text and then processes and summarizes the text based on the prompt scenario. ☆30 Updated 4 months ago
- ☆26 Updated 9 months ago
- An end-to-end sample of RAG showcasing development, evaluation, experimentation, and deployment using Promptflow, search products like Co… ☆49 Updated 5 months ago
- This repo helps you to build a team of AI agents with Autogen ☆128 Updated this week
- Legal Research Copilot Example Solution built with Generative AI capabilities of PostgreSQL on Azure ☆52 Updated last week
- A recipe that will walk you through using either Meta Llama 3.1 405B or GPT-4o deployed on Azure AI to generate a synthetic dataset using… ☆45 Updated last week
- This sample shows how to quickly get started with LlamaIndex.ai on Azure 🚀 ☆49 Updated last month
- A Hands-on Practical Guide to LlamaIndex ☆32 Updated 4 months ago
- Multi-modal & multi-domain customer service agent with real time text, voice and soon video ☆31 Updated this week
- The Multi-Agent Custom Automation Engine Solution Accelerator is an AI-driven orchestration system that manages a group of AI agents to a… ☆99 Updated this week
- This hands-on walks you through fine-tuning an open source LLM on Azure and serving the fine-tuned model on Azure. It is intended for Dat… ☆41 Updated 3 months ago
- Azure OpenAI integration as a custom skillset in Azure Cognitive Search ☆33 Updated last year
- ☆39 Updated 3 months ago
- Azure OpenAI benchmarking tool ☆136 Updated 8 months ago
- Semantic Kernel Workshop ☆12 Updated last year
- ☆52 Updated 4 months ago
- The RAG Experiment Accelerator is a versatile tool designed to expedite and facilitate the process of conducting experiments and evaluati… ☆229 Updated 2 months ago
- ☆56 Updated 3 weeks ago
- ☆61 Updated 3 weeks ago
- ☆69 Updated 10 months ago
- ☆30 Updated last month
- ☆76 Updated 3 weeks ago
- Interactive workflows for creating AI intelligence reports from real-world data sources ☆72 Updated this week
- Using Azure OpenAI GPT 4o to extract information such as text, tables and charts from Documents to Markdown ☆24 Updated 3 weeks ago
- ☆50 Updated last week
- This repository offers a Python framework for a retrieval-augmented generation (RAG) pipeline using text and images from MHTML documents,… ☆25 Updated 2 months ago