premAI-io / benchmarks
🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models.
★137 · Updated 10 months ago
Alternatives and similar repositories for benchmarks
Users who are interested in benchmarks are comparing it to the libraries listed below.
- experiments with inference on llama ★104 · Updated 11 months ago
- The Batched API provides a flexible and efficient way to process multiple requests in a batch, with a primary focus on dynamic batching o… ★134 · Updated last week
- ★121 · Updated last month
- ★198 · Updated last year
- Manage scalable open LLM inference endpoints in Slurm clusters ★257 · Updated 10 months ago
- Vector Database with support for late interaction and token level embeddings. ★54 · Updated 8 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ★263 · Updated 7 months ago
- ★210 · Updated 10 months ago
- Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free ★230 · Updated 7 months ago
- Inference server benchmarking tool ★67 · Updated last month
- EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M… ★220 · Updated 7 months ago
- Generalist and Lightweight Model for Text Classification ★128 · Updated last week
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers. ★34 · Updated 3 weeks ago
- XTR/WARP (SIGIR'25) is an extremely fast and accurate retrieval engine based on Stanford's ColBERTv2/PLAID and Google DeepMind's XTR. ★130 · Updated 3 weeks ago
- Lightweight wrapper for the independent implementation of SPLADE++ models for search & retrieval pipelines. Models and Library created by… ★31 · Updated 9 months ago
- ★130 · Updated 9 months ago
- OpenAI compatible API for TensorRT LLM triton backend ★207 · Updated 10 months ago
- Baguetter is a flexible, efficient, and hackable search engine library implemented in Python. It's designed for quickly benchmarking, imp… ★179 · Updated 9 months ago
- ★76 · Updated last year
- ★53 · Updated last year
- Late Interaction Models Training & Retrieval ★395 · Updated this week
- This repository contains the code for dataset curation and finetuning of instruct variant of the Bilingual OpenHathi model. The resultin… ★23 · Updated last year
- ★60 · Updated 2 months ago
- Set of scripts to finetune LLMs ★37 · Updated last year
- Fine-tune an LLM to perform batch inference and online serving. ★111 · Updated 3 weeks ago
- An efficient implementation of the method proposed in "The Era of 1-bit LLMs" ★153 · Updated 7 months ago
- ★260 · Updated 2 weeks ago
- This is our own implementation of 'Layer Selective Rank Reduction' ★238 · Updated last year
- ★143 · Updated 10 months ago
- Simple UI for debugging correlations of text embeddings ★180 · Updated this week