Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models.
★138 · Jul 25, 2024 · Updated last year
Alternatives and similar repositories for benchmarks
Users interested in benchmarks are comparing it to the libraries listed below.
- Machine Learning Serving focused on GenAI, with simplicity as the top priority. ★59 · Jan 5, 2026 · Updated 2 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs. ★267 · Dec 4, 2025 · Updated 3 months ago
- An auto-sleeping and auto-waking framework around llama.cpp. ★12 · Feb 8, 2025 · Updated last year
- Proxy server for a Triton gRPC server that runs inference on an embedding model, written in Rust. ★21 · Aug 10, 2024 · Updated last year
- Python library for automatic training, optimization, and comparison of Transformer models on most NLP tasks. ★20 · May 6, 2023 · Updated 2 years ago
- Quickly and securely turn any Linux box into a build and deployment assistant. ★25 · Oct 3, 2024 · Updated last year
- Modified beam search with periodic restarts. ★12 · Sep 12, 2024 · Updated last year
- Exploring the limitations of LLM-as-a-judge. ★20 · Aug 17, 2024 · Updated last year
- Metadata and data for the different databases we use for testing. ★14 · Jan 29, 2025 · Updated last year
- Implementations of various machine learning and MLOps applications/tutorials used in my Medium blog. ★11 · Jan 28, 2023 · Updated 3 years ago
- Triton implementation of GPT/LLAMA. ★21 · Aug 28, 2024 · Updated last year
- When real-time yoga position classification meets GNNs. ★11 · Sep 17, 2023 · Updated 2 years ago
- WebAISum is a Python script that summarizes web pages using AI models. It supports both local models like Ollama and remote … ★15 · Apr 28, 2024 · Updated last year
- Code for evaluating with Flow-Judge-v0.1, an open-source, lightweight (3.8B) language model optimized for LLM system evaluations. Crafte… ★84 · Oct 29, 2024 · Updated last year
- Writing Blog Posts with Generative Feedback Loops! ★50 · Mar 19, 2024 · Updated 2 years ago
- Iterate fast on your RAG pipelines. ★24 · Jun 21, 2025 · Updated 9 months ago
- Efficient, scalable, enterprise-grade CPU/GPU inference server for Hugging Face transformer models. ★1,687 · Oct 23, 2024 · Updated last year
- 3x faster inference; unofficial implementation of EAGLE speculative decoding. ★83 · Jul 3, 2025 · Updated 8 months ago
- Demo of an "always-on" AI assistant. ★24 · Feb 14, 2024 · Updated 2 years ago
- LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalabili… ★3,958 · Updated this week
- ★120 · Aug 28, 2024 · Updated last year
- Notes from our NLP reading club! ★18 · Jul 17, 2021 · Updated 4 years ago
- A lightweight evaluation suite tailored to assessing Indic LLMs across a diverse range of tasks. ★39 · Jun 10, 2024 · Updated last year
- Running Microsoft's BitNet inference framework via FastAPI, Uvicorn, and Docker. ★38 · Jul 2, 2025 · Updated 8 months ago
- A guidance compatibility layer for llama-cpp-python. ★36 · Sep 11, 2023 · Updated 2 years ago
- Sales Conversion Optimization MLOps: boost revenue with AI-powered insights. Features H2O AutoML, ZenML pipelines, Neptune.ai tracking, d… ★21 · Mar 22, 2025 · Updated last year
- REBUS: A Robust Evaluation Benchmark of Understanding Symbols. ★13 · Aug 13, 2024 · Updated last year
- AI_Powered_Dev_Search_Engine. ★12 · Mar 10, 2024 · Updated 2 years ago
- An ONNX converter script focused on embedding models. ★33 · Jan 14, 2025 · Updated last year
- MII makes low-latency and high-throughput inference possible, powered by DeepSpeed. ★2,101 · Jun 30, 2025 · Updated 8 months ago
- High-level library for batched embedding generation, blazing-fast web-based RAG, and quantized index processing. ★70 · Nov 17, 2025 · Updated 4 months ago
- Rust crate for some audio utilities. ★27 · Mar 8, 2025 · Updated last year
- Attend - to what matters. ★17 · Feb 22, 2025 · Updated last year
- Infinity is a high-throughput, low-latency serving engine for text embeddings, reranking models, CLIP, CLAP, and ColPali. ★2,724 · Feb 5, 2026 · Updated last month
- OpenAI-compatible API for the TensorRT-LLM Triton backend. ★219 · Aug 1, 2024 · Updated last year
- Official implementation of Half-Quadratic Quantization (HQQ). ★919 · Feb 26, 2026 · Updated 3 weeks ago
- Use `outlines` generators with Haystack. ★15 · Updated this week
- Low-latency, high-accuracy custom query routers for humans and agents. Built by Prithivi Da. ★120 · Mar 31, 2025 · Updated 11 months ago
- A PyTorch quantization backend for Optimum. ★1,032 · Nov 21, 2025 · Updated 4 months ago