Azure / The-LLM-Latency-Guidebook-Optimizing-Response-Times-for-GenAI-Applications
Many articles cover the principles of latency optimization for LLMs, but it is often unclear how to implement these principles in practice. This repository provides practical techniques for reducing the latency of GenAI applications.
☆30Updated last year
Alternatives and similar repositories for The-LLM-Latency-Guidebook-Optimizing-Response-Times-for-GenAI-Applications
Users that are interested in The-LLM-Latency-Guidebook-Optimizing-Response-Times-for-GenAI-Applications are comparing it to the libraries listed below
- This sample shows how to quickly get started with LlamaIndex.ai on Azure🚀☆60Updated 2 months ago
- Building LLM-Enabled Multi Agent Applications from Scratch☆188Updated this week
- An easy way to deploy the Langfuse observability platform to Azure Container Apps with Entra authentication.☆57Updated 2 months ago
- Automatically generate github documentation with readthedocs using your openai endpoint☆37Updated 2 years ago
- An end-to-end sample of RAG showcasing development, evaluation, experimentation, and deployment using Promptflow, search products like Co…☆54Updated last year
- The GPT-RAG Data Ingestion service automates processing of diverse documents—PDFs, images, spreadsheets, transcripts, and SharePoint—read…☆137Updated 3 weeks ago
- ☆123Updated last month
- Building your first LLM application with OpenAI, and AI-assisted Development, step-by-step!☆109Updated last week
- ☆50Updated 4 months ago
- Interactive workflows for creating AI intelligence reports from real-world data sources☆87Updated this week
- ☆76Updated last year
- A mixture of Gen AI cookbook recipes for Gen AI applications.☆220Updated last year
- Docs, Snippets, Guides☆75Updated this week
- GenAIOps with Prompt Flow is a "GenAIOps template and guidance" to help you build LLM-infused apps using Prompt Flow. It offers a range o…☆342Updated 5 months ago
- ☆37Updated 7 months ago
- An index of all of our weekly concepts + code events for aspiring AI Engineers and Business Leaders!!☆87Updated this week
- ☆53Updated 11 months ago
- Service to import data from various sources and index it in AI Search. Increases data relevance and reduces final size by 90%+. Useful fo…☆33Updated 11 months ago
- A recipe that will walk you through using either Meta Llama 3.1 405B or OpenAI GPT-4o deployed on Azure AI to generate a synthetic datase…☆72Updated 2 months ago
- The RAG Experiment Accelerator is a versatile tool designed to expedite and facilitate the process of conducting experiments and evaluati…☆272Updated 5 months ago
- This hands-on walks you through fine-tuning an open source LLM on Azure and serving the fine-tuned model on Azure. It is intended for Dat…☆54Updated 6 months ago
- The “Agentic Cookbook for Generative AI Agent usage” is a comprehensive guide designed to empower users with the knowledge and tools to e…☆133Updated 6 months ago
- The GPT-RAG Orchestrator service is an agentic orchestration layer built on Azure AI Foundry Agent Service and the Semantic Kernel framew…☆65Updated 3 weeks ago
- Some python code samples using Azure AI Search for Generative AI stuff☆66Updated 8 months ago
- A backend for a chat application written in Python FastAPI framework☆64Updated this week
- ☆28Updated last year
- This solution converts speech to text and then processes and summarizes the text based on the prompt scenario.☆36Updated 11 months ago
- A multimodal Retrieval Augmented Generation with code execution capabilities. Process multiple complex documents with images, table, char…☆69Updated 2 months ago
- ☆29Updated last year
- Hugging Face Deep Learning Containers (DLCs) for Google Cloud☆154Updated 5 months ago