jasonacox / TinyLLM

Set up and run a local LLM and chatbot using consumer-grade hardware.

☆312

Alternatives and similar repositories for TinyLLM

Users who are interested in TinyLLM are comparing it to the repositories listed below.

Picovoice / picollm
On-device LLM Inference Powered by X-Bit Quantization
☆278 Updated 2 weeks ago
bentoml / BentoVLLM
Self-host LLMs with vLLM and BentoML
☆168 Updated 3 weeks ago
Rivridis / LLM-Assistant
Locally running LLM with internet access
☆97 Updated 7 months ago
SqueezeAILab / TinyAgent
[EMNLP 2024 Demo] TinyAgent: Function Calling at the Edge!
☆467 Updated last year
intentee / llmops-handbook
Practical and advanced guide to LLMOps. It provides a solid understanding of large language models’ general concepts, deployment techniqu…
☆79 Updated last year
transformerlab / transformerlab-api
API Server for Transformer Lab
☆83 Updated 2 months ago
wangcx18 / llm-vscode-inference-server
An endpoint server for efficiently serving quantized open-source LLMs for code.
☆58 Updated 2 years ago
mani-kantap / llm-inference-solutions
A collection of all available inference solutions for the LLMs
☆94 Updated 11 months ago
BerriAI / liteLLM-proxy
☆185 Updated 2 years ago
ibm-granite / granite-3.1-language-models
Granite 3.1 Language Models
☆137 Updated 7 months ago
pseudotensor / open-strawberry
Building open version of OpenAI o1 via reasoning traces (Groq, ollama, Anthropic, Gemini, OpenAI, Azure supported) Demo: https://hugging…
☆188 Updated last year
runpod-workers / worker-vllm
The RunPod worker template for serving our large language model endpoints. Powered by vLLM.
☆401 Updated 2 weeks ago
AkiRusProd / llm-agent
LLM using long-term memory through vector database
☆52 Updated last year
nuance1979 / llama-server
LLaMA Server combines the power of LLaMA C++ with the beauty of Chatbot UI.
☆130 Updated 2 years ago
ibm-granite / granite-3.0-language-models
☆270 Updated 7 months ago
Itachi-Uchiha581 / Auto-Data
Auto Data is a library designed for quick and effortless creation of datasets tailored for fine-tuning Large Language Models (LLMs).
☆105 Updated last year
ngshya / easyRAG
Build your own RAG and run it locally on your laptop: ColBERT + DSPy + Streamlit
☆60 Updated last year
kolenaIO / autoarena
Rank LLMs, RAG systems, and prompts using automated head-to-head evaluation
☆108 Updated last year
aidatatools / ollama-benchmark
LLM Benchmark for Throughput via Ollama (Local LLMs)
☆331 Updated 3 weeks ago
iohub / collama
VSCode AI coding assistant powered by self-hosted llama.cpp endpoint.
☆183 Updated last year
akx / ggify
Tool to download models from Huggingface Hub and convert them to GGML/GGUF for llama.cpp
☆170 Updated 9 months ago
Maximilian-Winter / llama-cpp-agent
The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs). Allowing users to chat with LLM …
☆615 Updated 11 months ago
reid41 / QA-Pilot
QA-Pilot is an interactive chat project that leverages online/local LLMs for rapid understanding and navigation of GitHub code repositories.
☆318 Updated 5 months ago
chigkim / Ollama-MMLU-Pro
☆109 Updated 5 months ago
TrelisResearch / one-click-llms
One click templates for inferencing Language Models
☆228 Updated 2 months ago
continuedev / deploy-os-code-llm
🌉 How to deploy an open-source code LLM for your dev team
☆112 Updated 2 years ago
10Nates / ollama-autocoder
A simple-to-use Ollama autocompletion engine with options exposed and streaming functionality
☆144 Updated 10 months ago
priontific / MLX-text-completion-notebook
A simple Jupyter Notebook for learning MLX text-completion fine-tuning!
☆123 Updated last year
Nagi-ovo / CRAG-Ollama-Chat
Corrective RAG demo powered by Ollama
☆110 Updated last year
epolewski / EricLLM
A fast batching API to serve LLM models
☆189 Updated last year