🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models.
★141 · Jul 25, 2024 · Updated last year
Alternatives and similar repositories for benchmarks
Users that are interested in benchmarks are comparing it to the libraries listed below.
- Machine Learning Serving focused on GenAI with simplicity as the top priority. ★60 · Apr 6, 2026 · Updated 2 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ★268 · Dec 4, 2025 · Updated 6 months ago
- Proxy server for triton gRPC server that inferences embedding model in Rust ★21 · Aug 10, 2024 · Updated last year
- Python library for automatic training, optimization and comparison of Transformer models on most NLP tasks. ★20 · May 6, 2023 · Updated 3 years ago
- Learning and rediscovering ML from total scratch ★12 · Aug 30, 2021 · Updated 4 years ago
- Modified Beam Search with periodical restart ★12 · Sep 12, 2024 · Updated last year
- Exploring limitations of LLM-as-a-judge ★20 · Aug 17, 2024 · Updated last year
- B-Llama3o a llama3 with Vision Audio and Audio understanding as well as text and Audio and Animation Data output. ★26 · Jun 3, 2024 · Updated 2 years ago
- This repository contains the metadata and data of different databases that we use for testing ★14 · Jan 29, 2025 · Updated last year
- Implementation of various Machine learning and MLOps applications/tutorials used within my Medium blog. ★11 · Jan 28, 2023 · Updated 3 years ago
- A minimal MySQL proxy implemented in Rust. ★14 · Jun 18, 2022 · Updated 4 years ago
- Repo hosting codes and materials related to speeding LLMs' inference using token merging. ★37 · Oct 9, 2025 · Updated 8 months ago
- When real time Yoga Position classification meets GNN ★11 · Sep 17, 2023 · Updated 2 years ago
- WebAISum is a Python script that allows you to summarize web pages using AI models. It supports both local models like Ollama and remote … ★15 · Apr 28, 2024 · Updated 2 years ago
- Code for evaluating with Flow-Judge-v0.1 - an open-source, lightweight (3.8B) language model optimized for LLM system evaluations. Crafte… ★86 · Oct 29, 2024 · Updated last year
- LLM-driven automated knowledge graph construction from text using DSPy and Neo4j ★20 · Aug 19, 2024 · Updated last year
- an auto-sleeping and -waking framework around llama.cpp ★13 · Feb 8, 2025 · Updated last year
- Triton implementation of GPT/LLAMA ★22 · Aug 28, 2024 · Updated last year
- Writing Blog Posts with Generative Feedback Loops! ★52 · Mar 19, 2024 · Updated 2 years ago
- Iterate fast on your RAG pipelines ★24 · Jun 21, 2025 · Updated last year
- Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀 ★1,689 · Oct 23, 2024 · Updated last year
- 3x Faster Inference; Unofficial implementation of EAGLE Speculative Decoding ★84 · Jul 3, 2025 · Updated 11 months ago
- Demo of an "always-on" AI assistant. ★24 · Feb 14, 2024 · Updated 2 years ago
- An integration of Qdrant ANN vector database backend with Haystack ★46 · Jun 23, 2026 · Updated last week
- ★120 · Mar 18, 2026 · Updated 3 months ago
- Notes from our NLP reading club! ★19 · Jul 17, 2021 · Updated 4 years ago
- LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalabili… ★4,141 · Updated this week
- A lightweight evaluation suite tailored specifically for assessing Indic LLMs across a diverse range of tasks ★40 · Jun 10, 2024 · Updated 2 years ago
- Running Microsoft's BitNet inference framework via FastAPI, Uvicorn and Docker. ★39 · Jul 2, 2025 · Updated 11 months ago
- Sales Conversion Optimization MLOps: Boost revenue with AI-powered insights. Features H2O AutoML, ZenML pipelines, Neptune.ai tracking, d… ★21 · Mar 22, 2025 · Updated last year
- REBUS: A Robust Evaluation Benchmark of Understanding Symbols ★13 · Aug 13, 2024 · Updated last year
- AI_Powered_Dev_Search_Engine ★12 · Mar 10, 2024 · Updated 2 years ago
- An ONNX converter script focused on embedding models ★33 · Jan 14, 2025 · Updated last year
- MII makes low-latency and high-throughput inference possible, powered by DeepSpeed. ★2,107 · Jun 30, 2025 · Updated last year
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ★70 · Nov 17, 2025 · Updated 7 months ago
- Rust crate for some audio utilities ★29 · Jun 17, 2026 · Updated 2 weeks ago
- Attend - to what matters. ★17 · Feb 22, 2025 · Updated last year
- OpenAI compatible API for TensorRT LLM triton backend ★221 · Aug 1, 2024 · Updated last year
- Use `outlines` generators with Haystack. ★14 · Jun 22, 2026 · Updated last week