AI-Maker-Space / FastAPI-LLM-Model-Serving
How to quickly serve an LLM using FastAPI, Celery, and Redis
☆16 · Updated 2 years ago
Alternatives and similar repositories for FastAPI-LLM-Model-Serving
Users who are interested in FastAPI-LLM-Model-Serving are comparing it to the repositories listed below.
- Fine-tune an LLM to perform batch inference and online serving. ☆113 · Updated 5 months ago
- GenAIOps on Kubernetes: A collection of reference architectures for running GenAI at scale on Kubernetes using OSS tooling ☆134 · Updated last year
- [ACL'25] Official Code for LlamaDuo: LLMOps Pipeline for Seamless Migration from Service LLMs to Small-Scale Local LLMs ☆314 · Updated 3 months ago
- 💻 Decoding ML articles hub: Hands-on articles with code on production-grade ML ☆139 · Updated 8 months ago
- 🚀 Use NVIDIA NIMs with Haystack pipelines ☆31 · Updated last year
- Find the optimal model serving solution for 🤗 Hugging Face models 🚀 ☆44 · Updated 3 months ago
- Complete example of how to build an Agentic RAG architecture with Redis, Amazon Bedrock, and LlamaIndex. ☆100 · Updated 11 months ago
- GenAI Experimentation ☆58 · Updated 2 months ago
- Notebooks and code about generative AI, LLMs, MLOps, NLP, CV, and graph databases ☆126 · Updated this week
- Build Enterprise RAG (Retriever Augmented Generation) Pipelines to tackle various Generative AI use cases with LLMs by simply plugging co… ☆115 · Updated last year
- A collection of all available inference solutions for LLMs ☆92 · Updated 8 months ago
- Sales Conversion Optimization MLOps: Boost revenue with AI-powered insights. Features H2O AutoML, ZenML pipelines, Neptune.ai tracking, d… ☆20 · Updated 7 months ago
- Building Private Healthcare AI Assistant for Clinics Using Qdrant Hybrid Cloud, DSPy and Groq - Llama3 ☆22 · Updated last year
- 🤖 AI Assistant fine-tuned to provide support for coding and design questions based on the latest trends in the industry. ☆17 · Updated last year
- Sample notebooks and prompts for LLM evaluation ☆153 · Updated last week
- Using LlamaIndex with Ray for productionizing LLM applications ☆71 · Updated 2 years ago
- Optimized Large Language Models for Financial Applications – Efficient, Scalable, and Domain-Specific AI for Finance. ☆51 · Updated 4 months ago
- NLP/LLM MLOps pipeline for development, training, and evaluation, with scalable deployment and monitoring systems. ☆22 · Updated last year
- A collection of fine-tuning notebooks! ☆29 · Updated 2 years ago
- A collection of hands-on notebooks for LLM practitioners ☆50 · Updated 9 months ago
- Miscellaneous code and writings for MLOps ☆15 · Updated last month
- Structured pruning and bias visualization for Large Language Models. Tools for LLM optimization and fairness analysis. ☆23 · Updated 2 weeks ago
- Code Repository for Blog - How to Productionize Large Language Models (LLMs) ☆12 · Updated last year
- All code related to Medium articles ☆19 · Updated this week
- Self-host LLMs with vLLM and BentoML ☆154 · Updated 2 weeks ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆50 · Updated last year
- ☆146 · Updated last year
- Large Language Model (LLM) Inference API and Chatbot ☆126 · Updated last year
- A set of scripts and notebooks on LLM fine-tuning and dataset creation ☆111 · Updated last year
- Various installation guides for Large Language Models ☆76 · Updated 6 months ago