Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models.
★139 · Jul 25, 2024 · Updated last year
Alternatives and similar repositories for benchmarks
Users that are interested in benchmarks are comparing it to the libraries listed below.
- Machine Learning Serving focused on GenAI with simplicity as the top priority. ★59 · Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ★266 · Dec 4, 2025 · Updated 4 months ago
- an auto-sleeping and -waking framework around llama.cpp ★12 · Feb 8, 2025 · Updated last year
- Proxy server for triton gRPC server that inferences embedding model in Rust ★21 · Aug 10, 2024 · Updated last year
- Python library for automatic training, optimization and comparison of Transformer models on most NLP tasks. ★20 · May 6, 2023 · Updated 2 years ago
- [ICLRW'26] EoRA: Fine-tuning-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation ★33 · Mar 24, 2026 · Updated 2 weeks ago
- Learning and rediscovering ML from total scratch ★12 · Aug 30, 2021 · Updated 4 years ago
- Modified Beam Search with periodical restart ★12 · Sep 12, 2024 · Updated last year
- B-Llama3o a llama3 with Vision Audio and Audio understanding as well as text and Audio and Animation Data output. ★26 · Jun 3, 2024 · Updated last year
- This repository contains the metadata and data of different databases that we use for testing ★14 · Jan 29, 2025 · Updated last year
- Implementation of various Machine learning and MLOps applications/tutorials used within my Medium blog. ★11 · Jan 28, 2023 · Updated 3 years ago
- Repo hosting codes and materials related to speeding LLMs' inference using token merging. ★37 · Oct 9, 2025 · Updated 6 months ago
- Triton implementation of GPT/LLAMA ★21 · Aug 28, 2024 · Updated last year
- When real time Yoga Position classification meets GNN ★11 · Sep 17, 2023 · Updated 2 years ago
- WebAISum is a Python script that allows you to summarize web pages using AI models. It supports both local models like Ollama and remote … ★15 · Apr 28, 2024 · Updated last year
- Code for evaluating with Flow-Judge-v0.1 - an open-source, lightweight (3.8B) language model optimized for LLM system evaluations. Crafte… ★84 · Oct 29, 2024 · Updated last year
- Writing Blog Posts with Generative Feedback Loops! ★50 · Mar 19, 2024 · Updated 2 years ago
- Iterate fast on your RAG pipelines ★24 · Jun 21, 2025 · Updated 9 months ago
- a lightweight, open-source blueprint for building powerful and scalable LLM chat applications ★28 · Jun 7, 2024 · Updated last year
- Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models ★1,688 · Oct 23, 2024 · Updated last year
- 3x Faster Inference; Unofficial implementation of EAGLE Speculative Decoding ★82 · Jul 3, 2025 · Updated 9 months ago
- Demo of an "always-on" AI assistant. ★24 · Feb 14, 2024 · Updated 2 years ago
- An integration of Qdrant ANN vector database backend with Haystack ★45 · Updated this week
- LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalabili… ★3,997 · Updated this week
- Cortex.Tensorrt-LLM is a C++ inference library that can be loaded by any server at runtime. It submodules NVIDIA's TensorRT-LLM for GPU a… ★42 · Sep 26, 2024 · Updated last year
- ★120 · Mar 18, 2026 · Updated 3 weeks ago
- Notes from our NLP reading club! ★18 · Jul 17, 2021 · Updated 4 years ago
- A lightweight evaluation suite tailored specifically for assessing Indic LLMs across a diverse range of tasks ★39 · Jun 10, 2024 · Updated last year
- Running Microsoft's BitNet inference framework via FastAPI, Uvicorn and Docker. ★38 · Jul 2, 2025 · Updated 9 months ago
- Sales Conversion Optimization MLOps: Boost revenue with AI-powered insights. Features H2O AutoML, ZenML pipelines, Neptune.ai tracking, d… ★21 · Mar 22, 2025 · Updated last year
- A guidance compatibility layer for llama-cpp-python ★36 · Sep 11, 2023 · Updated 2 years ago
- REBUS: A Robust Evaluation Benchmark of Understanding Symbols ★13 · Aug 13, 2024 · Updated last year
- This project showcases engaging interactions between two AI chatbots. ★10 · Jan 10, 2024 · Updated 2 years ago
- AI_Powered_Dev_Search_Engine ★12 · Mar 10, 2024 · Updated 2 years ago
- A Multi-Session and Multi-Therapy Benchmark for High-Realism AI Psychological Counselor ★35 · Jan 13, 2026 · Updated 2 months ago
- This repository contains the source code for running llamaindex tutorials from https://howaibuildthis.substack.com/ ★41 · Jan 7, 2024 · Updated 2 years ago
- An ONNX converter script focused on embedding models ★33 · Jan 14, 2025 · Updated last year
- MII makes low-latency and high-throughput inference possible, powered by DeepSpeed. ★2,107 · Jun 30, 2025 · Updated 9 months ago
- 33B Chinese LLM, DPO QLORA, 100K context, AirLLM 70B inference with single 4GB GPU ★13 · May 5, 2024 · Updated last year