MinhNgyuen / llm-benchmark
Benchmark LLM performance
☆108 · Updated last year
Alternatives and similar repositories for llm-benchmark
Users interested in llm-benchmark are comparing it to the repositories listed below.
- LLM Benchmark for Throughput via Ollama (Local LLMs) ☆313 · Updated 3 months ago
- Comparison of the output quality of quantization methods, using Llama 3, transformers, GGUF, EXL2. ☆165 · Updated last year
- The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs), allowing users to chat with LLM … ☆610 · Updated 9 months ago
- ☆108 · Updated 3 months ago
- An application for running LLMs locally on your device, with your documents, facilitating detailed citations in generated responses. ☆620 · Updated last year
- An innovative library for efficient LLM inference via low-bit quantization ☆350 · Updated last year
- An OpenAI-API-compatible API for chat with image input and questions about the images, aka multimodal. ☆266 · Updated 9 months ago
- Distributed inference for MLX LLMs ☆99 · Updated last year
- Comparison of Language Model Inference Engines ☆237 · Updated 11 months ago
- llama3.cuda is a pure C/CUDA implementation of the Llama 3 model. ☆349 · Updated 7 months ago
- Compare open-source local LLM inference projects by their metrics to assess popularity and activeness. ☆682 · Updated last month
- Automatically quantize GGUF models ☆219 · Updated last month
- WilmerAI is one of the oldest LLM semantic routers. It uses multi-layer prompt routing and complex workflows to allow you to not only cre… ☆791 · Updated last month
- The RunPod worker template for serving our large language model endpoints. Powered by vLLM. ☆386 · Updated this week
- One-click templates for inferencing Language Models ☆221 · Updated 2 weeks ago
- Practical and advanced guide to LLMOps. It provides a solid understanding of large language models’ general concepts, deployment techniqu… ☆76 · Updated last year
- A simple experiment on letting two local LLMs have a conversation about anything! ☆112 · Updated last year
- Fast parallel LLM inference for MLX ☆234 · Updated last year
- Your Trusty Memory-enabled AI Companion - Simple RAG chatbot optimized for local LLMs | 12 Languages Supported | OpenAI API Compatible ☆344 · Updated 9 months ago
- A fast batching API for serving LLMs ☆189 · Updated last year
- This small API downloads and exposes access to NeuML's txtai-wikipedia and full wikipedia datasets, taking in a query and returning full … ☆101 · Updated 3 months ago
- This reference can be used with any existing OpenAI-integrated apps to run with TRT-LLM inference locally on GeForce GPU on Windows inste… ☆126 · Updated last year
- Fully-featured, beautiful web interface for vLLM, built with NextJS. ☆163 · Updated this week
- ☆209 · Updated 3 months ago
- This project demonstrates a basic chain-of-thought interaction with any LLM (Large Language Model) ☆323 · Updated last year
- SiLLM simplifies the process of training and running Large Language Models (LLMs) on Apple Silicon by leveraging the MLX framework. ☆284 · Updated 5 months ago
- Function-calling-based LLM agents ☆289 · Updated last year
- A collection of prompts to challenge the reasoning abilities of large language models in the presence of misguiding information ☆453 · Updated 4 months ago
- 🚀 Retrieval Augmented Generation (RAG) with txtai. Combine search and LLMs to find insights with your own data. ☆421 · Updated last week
- Set up and run a local LLM and chatbot using consumer-grade hardware. ☆298 · Updated 2 weeks ago