premAI-io / benchmarks
🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models.
☆138 Updated last year
Alternatives and similar repositories for benchmarks
Users who are interested in benchmarks are comparing it to the libraries listed below.
- experiments with inference on llama ☆103 Updated last year
- ☆198 Updated last year
- ☆140 Updated 5 months ago
- The Batched API provides a flexible and efficient way to process multiple requests in a batch, with a primary focus on dynamic batching o… ☆155 Updated 6 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆267 Updated last month
- Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free ☆232 Updated last year
- Manage scalable open LLM inference endpoints in Slurm clusters ☆279 Updated last year
- Vector Database with support for late interaction and token level embeddings. ☆54 Updated 7 months ago
- ☆210 Updated 7 months ago
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers. ☆32 Updated 4 months ago
- Machine Learning Serving focused on GenAI with simplicity as the top priority. ☆59 Updated 3 weeks ago
- Convenient wrapper for fine-tuning and inference of Large Language Models (LLMs) with several quantization techniques (GPTQ, bitsandbytes… ☆146 Updated 2 years ago
- A stable, fast and easy-to-use inference library with a focus on a sync-to-async API ☆47 Updated last year
- Toolkit for attaching, training, saving and loading of new heads for transformer models ☆294 Updated 10 months ago
- FineTune LLMs in few lines of code (Text2Text, Text2Speech, Speech2Text) ☆246 Updated 2 years ago
- Efficient vector database for hundreds of millions of embeddings. ☆211 Updated last year
- EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M… ☆249 Updated last year
- ☆67 Updated 10 months ago
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆77 Updated last year
- A Lightweight Library for AI Observability ☆255 Updated 11 months ago
- Multi-threaded matrix multiplication and cosine similarity calculations for dense and sparse matrices. Appropriate for calculating the K … ☆86 Updated last year
- Simple UI for debugging correlations of text embeddings ☆305 Updated 8 months ago
- Benchmark various LLM Structured Output frameworks: Instructor, Mirascope, Langchain, LlamaIndex, Fructose, Marvin, Outlines, etc on task… ☆184 Updated last year
- A set of scripts and notebooks on LLM fine-tuning and dataset creation ☆115 Updated last year
- Let's build better datasets, together! ☆269 Updated last year
- ☆161 Updated last year
- C++ inference wrappers for running blazing fast embedding services on your favourite serverless like AWS Lambda. By Prithivi Da, PRs welc… ☆23 Updated last year
- Low-Rank adapter extraction for fine-tuned transformers models ☆180 Updated last year
- Comparison of Language Model Inference Engines ☆239 Updated last year
- Fine-tune an LLM to perform batch inference and online serving. ☆117 Updated 8 months ago