Azure / The-LLM-Latency-Guidebook-Optimizing-Response-Times-for-GenAI-Applications
Many articles cover the principles of latency optimization for LLMs, but it is often unclear how to implement those principles in practice. This repository provides practical techniques for reducing the latency of GenAI applications.
☆31Updated last year
Alternatives and similar repositories for The-LLM-Latency-Guidebook-Optimizing-Response-Times-for-GenAI-Applications
Users who are interested in The-LLM-Latency-Guidebook-Optimizing-Response-Times-for-GenAI-Applications are comparing it to the repositories listed below
- This sample shows how to quickly get started with LlamaIndex.ai on Azure🚀☆60Updated 3 months ago
- Guide for designing adaptive, scalable, and secure enterprise multi-agent systems☆119Updated 2 weeks ago
- ☆55Updated 4 months ago
- ☆28Updated last year
- Interactive workflows for creating AI intelligence reports from real-world data sources☆94Updated last week
- ☆29Updated last year
- An end-to-end sample of RAG showcasing development, evaluation, experimentation, and deployment using Promptflow, search products like Co…☆54Updated last year
- The GPT-RAG Data Ingestion service automates processing of diverse documents—PDFs, images, spreadsheets, transcripts, and SharePoint—read…☆154Updated last month
- This solution converts speech to text and then processes and summarizes the text based on the prompt scenario.☆37Updated last year
- Example for Deploying Chatbot using Streamlit and Azure Web App☆52Updated 2 years ago
- An easy way to deploy the Langfuse observability platform to Azure Container Apps with Entra authentication.☆57Updated 3 months ago
- A multimodal Retrieval Augmented Generation with code execution capabilities. Process multiple complex documents with images, table, char…☆77Updated last month
- The GPT-RAG Orchestrator service is an agentic orchestration layer built on Azure AI Foundry Agent Service and the Semantic Kernel framew…☆70Updated 2 weeks ago
- GenAIOps with Prompt Flow is a "GenAIOps template and guidance" to help you build LLM-infused apps using Prompt Flow. It offers a range o…☆343Updated 6 months ago
- Using LlamaIndex with Ray for productionizing LLM applications☆71Updated 2 years ago
- A recipe that will walk you through using either Meta Llama 3.1 405B or OpenAI GPT-4o deployed on Azure AI to generate a synthetic datase…☆75Updated 4 months ago
- This repo helps you to build a team of AI agents with Autogen☆227Updated 3 weeks ago
- Indexing framework designed for the automated creation of structured knowledge bases in Azure AI Search☆14Updated 5 months ago
- A curated list of 🌌 Azure OpenAI, 🦙 Large Language Models (RAG, Agent), and references.☆385Updated this week
- ☆124Updated last month
- Virtual focus group with custom personas, product details, and final analysis created with AutoGen, Ollama/Llama3, and Streamlit.☆47Updated last year
- Learn to build and customize multi-agent systems using the AutoGen. The course teaches you to implement complex AI applications through a…☆122Updated last year
- Automating Advanced Business Analytics with ChatGPT☆65Updated 2 years ago
- Learn how to build solutions with Large Language Models.☆159Updated last year
- ☆108Updated 4 months ago
- The RAG Experiment Accelerator is a versatile tool designed to expedite and facilitate the process of conducting experiments and evaluati…☆286Updated 7 months ago
- Some python code samples using Azure AI Search for Generative AI stuff☆65Updated 9 months ago
- Quickstart sample for using the Azure AI Studio with the SDK or CLI options - and the LangChain framework.☆60Updated last year
- Legal Research Copilot Example Solution built with Generative AI capabilities of PostgreSQL on Azure☆94Updated 8 months ago
- ☆75Updated last year