MinhNgyuen / llm-benchmark
Benchmark LLM performance
☆100 · Updated 11 months ago
Alternatives and similar repositories for llm-benchmark
Users interested in llm-benchmark are comparing it to the repositories listed below.
- LLM Benchmark for Throughput via Ollama (Local LLMs) ☆255 · Updated 2 weeks ago
- Comparison of the output quality of quantization methods, using Llama 3, transformers, GGUF, EXL2. ☆156 · Updated last year
- A proxy server for multiple ollama instances with Key security ☆461 · Updated last week
- 🚀 Retrieval Augmented Generation (RAG) with txtai. Combine search and LLMs to find insights with your own data. ☆385 · Updated 2 months ago
- Fully-featured, beautiful web interface for vLLM - built with NextJS. ☆146 · Updated 2 months ago
- Automatically quantize GGUF models ☆187 · Updated this week
- An Open WebUI function for a better R1 experience ☆79 · Updated 4 months ago
- ☆204 · Updated last month
- InferX is an Inference Function-as-a-Service platform ☆116 · Updated 2 weeks ago
- A repository of Open WebUI tools to use with your favourite LLMs ☆247 · Updated this week
- An OpenAI API-compatible API for chat with image input and questions about the images, i.e. multimodal. ☆257 · Updated 4 months ago
- Your Trusty Memory-enabled AI Companion - Simple RAG chatbot optimized for local LLMs | 12 Languages Supported | OpenAI API Compatible ☆319 · Updated 4 months ago
- Dataset crafting with RAG/Wikipedia ground truth and efficient fine-tuning using MLX and Unsloth. Includes configurable dataset annotation … ☆185 · Updated 11 months ago
- A Python-based web-assisted large language model (LLM) search assistant using Llama.cpp ☆357 · Updated 8 months ago
- Export and back up Ollama models into GGUF and Modelfile ☆75 · Updated 10 months ago
- A fast batching API to serve LLM models ☆183 · Updated last year
- A Python package for developing AI applications with local LLMs. ☆150 · Updated 6 months ago
- Code execution utilities for Open WebUI & Ollama ☆290 · Updated 8 months ago
- On-device LLM Inference Powered by X-Bit Quantization ☆256 · Updated last month
- ☆95 · Updated 6 months ago
- The Fastest Way to Fine-Tune LLMs Locally ☆312 · Updated 3 months ago
- A simple experiment on letting two local LLMs have a conversation about anything! ☆110 · Updated last year
- The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs). Allowing users to chat with LLM … ☆578 · Updated 5 months ago
- Minimal Linux OS with a Model Context Protocol (MCP) gateway to expose local capabilities to LLMs. ☆257 · Updated 3 weeks ago
- ☆116 · Updated 8 months ago
- Practical and advanced guide to LLMOps. It provides a solid understanding of large language models’ general concepts, deployment techniqu… ☆69 · Updated 11 months ago
- Docker Compose to run vLLM on Windows ☆92 · Updated last year
- The RunPod worker template for serving our large language model endpoints. Powered by vLLM. ☆333 · Updated 3 weeks ago
- This small API downloads and exposes access to NeuML's txtai-wikipedia and full wikipedia datasets, taking in a query and returning full … ☆98 · Updated this week
- An application for running LLMs locally on your device, with your documents, facilitating detailed citations in generated responses. ☆602 · Updated 8 months ago