AI-Maker-Space / FastAPI-LLM-Model-Serving
How to quickly serve an LLM using FastAPI, Celery, and Redis
☆16 · Updated last year
Alternatives and similar repositories for FastAPI-LLM-Model-Serving
Users who are interested in FastAPI-LLM-Model-Serving are comparing it to the libraries listed below
- Fine-tune an LLM to perform batch inference and online serving. ☆112 · Updated 2 months ago
- Find the optimal model serving solution for 🤗 Hugging Face models ☆43 · Updated 3 weeks ago
- GenAIOps on Kubernetes: A collection of reference architectures for running GenAI at scale on Kubernetes using OSS tooling ☆132 · Updated 9 months ago
- Decoding ML articles hub: Hands-on articles with code on production-grade ML ☆137 · Updated 5 months ago
- Sample notebooks and prompts for LLM evaluation ☆138 · Updated 2 months ago
- A repository for all ZenML projects that are specific production use-cases. ☆269 · Updated 2 weeks ago
- Using LlamaIndex with Ray for productionizing LLM applications ☆71 · Updated 2 years ago
- GenAI Experimentation ☆57 · Updated 3 weeks ago
- [ACL'25] Official Code for LlamaDuo: LLMOps Pipeline for Seamless Migration from Service LLMs to Small-Scale Local LLMs ☆313 · Updated last month
- A set of scripts and notebooks on LLM fine-tuning and dataset creation ☆110 · Updated 10 months ago
- Complete implementation of Llama2 with/without KV cache & inference ☆48 · Updated last year
- Use NVIDIA NIMs with Haystack pipelines ☆32 · Updated 11 months ago
- A collection of all available inference solutions for the LLMs ☆91 · Updated 5 months ago
- A collection of fine-tuning notebooks! ☆27 · Updated last year
- Self-host LLMs with vLLM and BentoML ☆140 · Updated last week
- A collection of hands-on notebooks for LLM practitioners ☆49 · Updated 7 months ago
- This repository will contain the presentation and Python Jupyter notebooks for the DataHack Summit 2024 conference talk, Improving Real-w… ☆118 · Updated 10 months ago
- Notebooks and code about Generative AI, LLMs, MLOps, NLP, CV, and graph databases ☆122 · Updated last week
- A Hands-on Practical Guide to LlamaIndex ☆33 · Updated 9 months ago
- Various installation guides for Large Language Models ☆72 · Updated 3 months ago
- Build Enterprise RAG (Retrieval Augmented Generation) Pipelines to tackle various Generative AI use cases with LLMs by simply plugging co… ☆113 · Updated last year
- A template to kick-start your Python project ✨ ☆52 · Updated 3 weeks ago
- Examples of RAG using LlamaIndex with local LLMs - Gemma, Mixtral 8x7B, Llama 2, Mistral 7B, Orca 2, Phi-2, Neural 7B ☆129 · Updated last year
- This playlab encompasses a multitude of projects crafted through the utilization of Large Language Models, showcasing the versatility and… ☆124 · Updated last week
- Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models. ☆137 · Updated last year
- Mistral + Haystack: build RAG pipelines that rock ☆105 · Updated last year
- Low latency, High Accuracy, Custom Query routers for Humans and Agents. Built by Prithivi Da ☆113 · Updated 4 months ago
- ☆20 · Updated last year
- Sales Conversion Optimization MLOps: Boost revenue with AI-powered insights. Features H2O AutoML, ZenML pipelines, Neptune.ai tracking, d… ☆17 · Updated 4 months ago
- Building your first LLM application with OpenAI, and AI-assisted Development, step-by-step! ☆101 · Updated 2 months ago