AI-Maker-Space / FastAPI-LLM-Model-Serving
How to quickly serve an LLM using FastAPI, Celery, and Redis
☆15Updated last year
Alternatives and similar repositories for FastAPI-LLM-Model-Serving:
Users who are interested in FastAPI-LLM-Model-Serving are comparing it to the libraries listed below
- Fine-tune an LLM to perform batch inference and online serving.☆109Updated last week
- Build Enterprise RAG (Retrieval Augmented Generation) Pipelines to tackle various Generative AI use cases with LLMs by simply plugging co…☆109Updated 9 months ago
- GenAI Experimentation☆58Updated this week
- Optimized Large Language Models for Financial Applications – Efficient, Scalable, and Domain-Specific AI for Finance.☆46Updated 3 weeks ago
- A collection of fine-tuning notebooks!☆27Updated last year
- Sample notebooks and prompts for LLM evaluation☆124Updated last week
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute…☆49Updated 9 months ago
- Running load tests on a FastAPI application using Locust☆15Updated last month
- A set of scripts and notebooks on LLM finetuning and dataset creation☆106Updated 7 months ago
- Language Model for Mainframe Modernization☆51Updated 8 months ago
- A collection of hands-on notebooks for LLM practitioners☆47Updated 3 months ago
- ☆143Updated 9 months ago
- Starter pack for NeurIPS LLM Efficiency Challenge 2023.☆124Updated last year
- Various installation guides for Large Language Models☆69Updated this week
- GenAIOps on Kubernetes: A collection of reference architectures for running GenAI at scale on Kubernetes using OSS tooling☆129Updated 5 months ago
- Find the optimal model serving solution for 🤗 Hugging Face models 🚀☆43Updated last year
- This playlab encompasses a multitude of projects crafted through the utilization of Large Language Models, showcasing the versatility and…☆118Updated last week
- ☆29Updated last year
- 💻 Decoding ML articles hub: Hands-on articles with code on production-grade ML☆129Updated last month
- EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M…☆213Updated 5 months ago
- Set of scripts to finetune LLMs☆37Updated last year
- RAGs: Simple implementations of Retrieval Augmented Generation (RAG) Systems☆100Updated 3 months ago
- ☆18Updated 4 months ago
- Using LlamaIndex with Ray for productionizing LLM applications☆71Updated last year
- Example code and notebooks related to mlflow, llmops, etc.☆42Updated 9 months ago
- This project involves using llamaindex Multi Agents concierge system and Qdrant vector database to customize the RAG application with use…☆50Updated 8 months ago
- Low latency, High Accuracy, Custom Query routers for Humans and Agents. Built by Prithivi Da☆102Updated 3 weeks ago
- Document Q&A on Wikipedia articles using LLMs☆75Updated last year
- Unlock the potential of finetuning Large Language Models (LLMs). Learn from industry experts and discover when to apply finetuning, data …☆56Updated last year
- ☆34Updated last month