Azure / The-LLM-Latency-Guidebook-Optimizing-Response-Times-for-GenAI-Applications
Many articles cover the principles of latency optimization for LLMs, but it is often unclear how to actually implement these principles. This repository provides practical techniques for reducing the latency of GenAI applications.
☆34Updated last year
Alternatives and similar repositories for The-LLM-Latency-Guidebook-Optimizing-Response-Times-for-GenAI-Applications
Users who are interested in The-LLM-Latency-Guidebook-Optimizing-Response-Times-for-GenAI-Applications are comparing it to the repositories listed below
- Guide for designing adaptive, scalable, and secure enterprise multi-agent systems☆147Updated last month
- An end-to-end sample of RAG showcasing development, evaluation, experimentation, and deployment using Promptflow, search products like Co…☆55Updated last year
- This hands-on walks you through fine-tuning an open source LLM on Azure and serving the fine-tuned model on Azure. It is intended for Dat…☆59Updated 10 months ago
- ☆28Updated last year
- ☆30Updated last year
- The GPT-RAG Data Ingestion service automates processing of diverse documents—PDFs, images, spreadsheets, transcripts, and SharePoint—read…☆159Updated last week
- Learn how to build solutions with Large Language Models.☆160Updated last year
- This sample shows how to quickly get started with LlamaIndex.ai on Azure🚀☆61Updated 5 months ago
- Some python code samples using Azure AI Search for Generative AI stuff☆67Updated 11 months ago
- ☆114Updated last week
- Build secure LangChain applications on Azure☆111Updated last month
- Responses API on Azure OpenAI samples☆55Updated this week
- The GPT-RAG Orchestrator service is an agentic orchestration layer built on Azure AI Foundry Agent Service and the Semantic Kernel framew…☆73Updated last week
- ☆55Updated 6 months ago
- This solution converts speech to text and then processes and summarizes the text based on the prompt scenario.☆38Updated last year
- Building LLM-Enabled Multi Agent Applications from Scratch☆328Updated 2 weeks ago
- A multimodal Retrieval Augmented Generation with code execution capabilities. Process multiple complex documents with images, table, char…☆79Updated this week
- Automatically generate github documentation with readthedocs using your openai endpoint☆38Updated 2 years ago
- Indexing framework designed for the automated creation of structured knowledge bases in Azure AI Search☆14Updated 7 months ago
- Example for Deploying Chatbot using Streamlit and Azure Web App☆53Updated 2 years ago
- Showcase Azure platform’s machine learning capability to recognize document type, extract required fields and push data to downstream app…☆23Updated 2 years ago
- Automating Advanced Business Analytics with ChatGPT☆66Updated 2 years ago
- ☆125Updated 3 weeks ago
- An easy way to deploy the Langfuse observability platform to Azure Container Apps with Entra authentication.☆58Updated 5 months ago
- Model Context Protocol Servers for Azure AI Search☆52Updated 9 months ago
- Assistant API to chat with tabular data and perform analytics in natural language.☆54Updated last year
- This repository offers a Python framework for a retrieval-augmented generation (RAG) pipeline using text and images from MHTML documents,…☆34Updated 2 months ago
- Automated Retrieval and GPT Understanding System by utilizing Azure Document Intelligence in combination with GPT models.☆122Updated last week
- An index of all of our weekly concepts + code events for aspiring AI Engineers and Business Leaders!!☆95Updated this week
- ☆111Updated 2 months ago