ModelsLab / modelq
ModelQ is a lightweight, battle-tested Python library for scheduling and queuing machine learning inference tasks. It's designed as a faster and simpler alternative to Celery for ML workloads, using Redis and threading to efficiently run background tasks.
☆18 · Updated last week
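The description above names the mechanism (a Redis-backed queue drained by background threads) without showing code. As a rough orientation, here is a minimal sketch of that general pattern under stated assumptions: `redis-py` is installed, Redis is running locally, and every name (`enqueue`, `worker`, `ml_tasks`) is hypothetical — this is not ModelQ's actual API.

```python
# Minimal sketch of a Redis-backed background task queue (the general
# pattern ModelQ's description refers to, NOT its real interface).
import json
import threading
import time

import redis  # pip install redis

QUEUE_KEY = "ml_tasks"  # hypothetical queue name
r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def enqueue(task_name: str, payload: dict) -> None:
    """Serialize a task and push it onto the Redis list."""
    r.lpush(QUEUE_KEY, json.dumps({"task": task_name, "payload": payload}))

def worker(stop: threading.Event) -> None:
    """Background thread: block on the queue and run tasks as they arrive."""
    while not stop.is_set():
        item = r.brpop(QUEUE_KEY, timeout=1)  # (key, value) or None on timeout
        if item is None:
            continue
        task = json.loads(item[1])
        # A real system would dispatch to a registered inference function
        # (e.g. a model forward pass); here we just print the task.
        print(f"running {task['task']} with {task['payload']}")

stop = threading.Event()
threading.Thread(target=worker, args=(stop,), daemon=True).start()

enqueue("classify_image", {"image_url": "https://example.com/cat.jpg"})
time.sleep(2)  # give the worker a moment to drain the queue
stop.set()
```

The appeal over Celery for this use case is the absence of a broker/result-backend split and worker processes: one process, one Redis list, and threads that share the already-loaded model in memory.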
Alternatives and similar repositories for modelq
Users interested in modelq are comparing it to the libraries listed below.
- agent-from-scratch is a Python-based repository designed for developers and researchers interested in understanding the inner workings of…☆96 · Updated last year
- lancedb-myntra-fashion-search☆33 · Updated last year
- ☆207 · Updated last year
- A template to kick-start your Python project ✨🚀☆53 · Updated 6 months ago
- A repository for all ZenML projects that target specific production use cases.☆302 · Updated 2 months ago
- Notebooks for fine-tuning PaliGemma☆117 · Updated 9 months ago
- Integrating SSE with NVIDIA Triton Inference Server using a Python backend and the Zephyr model. There is very little documentation on how to use …☆10 · Updated last year
- A repository for hacking Generative Fill with Open Source Tools☆34 · Updated last year
- Notebooks and scripts that showcase running quantized diffusion models on consumer GPUs☆38 · Updated last year
- Just some stuff: interview questions, books, annotated papers, notes, cheat sheets, etc. related to ML, AI, Deep Learning and Data Sc…☆123 · Updated 5 months ago
- ☆56 · Updated last year
- 🚀 Framework for seamless fine-tuning of the Whisper model on a multilingual dataset and deployment to prod.☆36 · Updated 11 months ago
- End-to-end recipes for optimizing diffusion models with torchao and diffusers (inference and FP8 training).☆392 · Updated last month
- 🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models.☆138 · Updated last year
- zero-to-lightning☆31 · Updated last year
- ☆26 · Updated 2 years ago
- ☆127 · Updated 10 months ago
- Low-latency, high-accuracy, custom query routers for humans and agents. Built by Prithivi Da☆119 · Updated 10 months ago
- FRP Fork☆177 · Updated 10 months ago
- Implements RED metrics in FastAPI, integrated with Prometheus and Grafana (see the sketch after this list)☆40 · Updated 11 months ago
- Fine-tune an LLM to perform batch inference and online serving.☆120 · Updated 8 months ago
- How to quickly serve an LLM using FastAPI, Celery, and Redis☆16 · Updated 2 years ago
- This project shows how to serve an ONNX-optimized image classification model as a web service with FastAPI, Docker, and Kubernetes.☆224 · Updated 3 years ago
- How to serve ML predictions 100x faster☆59 · Updated last year
- A CLI to estimate inference memory requirements for Hugging Face models, written in Python.☆646 · Updated last week
- Minimal example scripts of the Hugging Face Trainer, focused on staying under 150 lines☆196 · Updated last year
- Complete implementation of Llama2 with/without KV cache & inference 🚀☆49 · Updated last year
- A simple guide to MLOps through ZenML and its various integrations.☆188 · Updated 2 years ago
- Hands-on hub for learning techniques to optimize and serve AI models in production in the most efficient way.☆14 · Updated 5 months ago
- This playlab encompasses a multitude of projects built with Large Language Models, showcasing the versatility and…☆145 · Updated 2 months ago
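One entry above implements RED metrics (Rate, Errors, Duration) in FastAPI with Prometheus and Grafana. Since that repository's code is not shown here, the following is a minimal sketch of the pattern using `prometheus_client`; the metric names and the `/metrics` route are my own choices, not necessarily what that project uses.

```python
# Minimal sketch of the RED pattern (Rate, Errors, Duration) in FastAPI,
# exported in Prometheus format. Metric names are illustrative.
import time

from fastapi import FastAPI, Request, Response
from prometheus_client import CONTENT_TYPE_LATEST, Counter, Histogram, generate_latest

app = FastAPI()

REQUESTS = Counter(
    "http_requests_total", "Request rate (R)", ["method", "path", "status"]
)
ERRORS = Counter(
    "http_request_errors_total", "Error count (E)", ["method", "path"]
)
DURATION = Histogram(
    "http_request_duration_seconds", "Request duration (D)", ["method", "path"]
)

@app.middleware("http")
async def red_metrics(request: Request, call_next):
    start = time.perf_counter()
    response = await call_next(request)
    elapsed = time.perf_counter() - start
    # NOTE: labeling by the raw path can explode label cardinality;
    # production setups usually label by the route template instead.
    labels = (request.method, request.url.path)
    REQUESTS.labels(*labels, str(response.status_code)).inc()
    DURATION.labels(*labels).observe(elapsed)
    if response.status_code >= 500:
        ERRORS.labels(*labels).inc()
    return response

@app.get("/metrics")
def metrics() -> Response:
    # Prometheus scrapes this endpoint in its text exposition format.
    return Response(generate_latest(), media_type=CONTENT_TYPE_LATEST)
```

Prometheus would scrape `/metrics` on an interval, and Grafana would then chart, for example, `rate(http_requests_total[1m])` for the R in RED and histogram quantiles of the duration metric for the D.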