MinhNgyuen / llm-benchmark
Benchmark llm performance
☆95 Updated 8 months ago
Alternatives and similar repositories for llm-benchmark:
Users that are interested in llm-benchmark are comparing it to the libraries listed below
- LLM Benchmark for Throughput via Ollama (Local LLMs) ☆203 Updated last month
- Retrieval Augmented Generation (RAG) with txtai. Combine search and LLMs to find insights with your own data. ☆346 Updated 3 months ago
- a Repository of Open-WebUI tools to use with your favourite LLMs ☆182 Updated last week
- A proxy server for multiple ollama instances with Key security ☆373 Updated this week
- ☆197 Updated 2 weeks ago
- This is the Mixture-of-Agents (MoA) concept, adapted from the original work by TogetherAI. My version is tailored for local model usage a… ☆113 Updated 9 months ago
- Testing LLM reasoning abilities with family relationship quizzes. ☆62 Updated 2 months ago
- An Open WebUI function for a better R1 experience ☆79 Updated 3 weeks ago
- A fast batching API to serve LLM models ☆183 Updated 11 months ago
- Comparison of the output quality of quantization methods, using Llama 3, transformers, GGUF, EXL2. ☆148 Updated 10 months ago
- automatically quant GGUF models ☆164 Updated last week
- A python package for serving LLM on OpenAI-compatible API endpoints with prompt caching using MLX. ☆73 Updated 3 months ago
- transparent proxy server on demand model swapping for llama.cpp (or any local OpenAPI compatible server) ☆475 Updated this week
- Use Codestral Mamba with Visual Studio Code and the Continue extension. A local LLM alternative to GitHub Copilot. ☆31 Updated 8 months ago
- Serving LLMs in the HF-Transformers format via a PyFlask API ☆71 Updated 6 months ago
- CaSIL is an advanced natural language processing system that implements a sophisticated four-layer semantic analysis architecture. It pro… ☆64 Updated 4 months ago
- Guaranteed structured outputs from any language model. Eliminate 100% of schema violations and state tracking failures in your LLM applic… ☆120 Updated this week
- The Fastest Way to Fine-Tune LLMs Locally ☆285 Updated last week
- A Python-based web-assisted large language model (LLM) search assistant using Llama.cpp ☆346 Updated 5 months ago
- ☆83 Updated 3 months ago
- A simple experiment on letting two local LLMs have a conversation about anything! ☆107 Updated 8 months ago
- Gradio based tool to run opensource LLM models directly from Huggingface ☆91 Updated 9 months ago
- ☆66 Updated last month
- GPU Power and Performance Manager ☆57 Updated 5 months ago
- This project demonstrates a basic chain-of-thought interaction with any LLM (Large Language Model) ☆318 Updated 6 months ago
- Fully-featured, beautiful web interface for vLLM - built with NextJS. ☆116 Updated last week
- Self-host LLMs with vLLM and BentoML ☆97 Updated this week
- An OpenAI API compatible API for chat with image input and questions about the images. aka Multimodal. ☆243 Updated 3 weeks ago
- A command-line personal assistant that integrates with Google Calendar, Gmail, and Tasks to help manage your digital life. ☆120 Updated 4 months ago
- A real-time speech-to-speech chatbot powered by Whisper Small, Llama 3.2, and Kokoro-82M. ☆196 Updated 2 months ago