gpustack / gguf-parser-go
Review/Check GGUF files and estimate the memory usage and maximum tokens per second.
☆127 Updated last week
Alternatives and similar repositories for gguf-parser-go:
Users who are interested in gguf-parser-go are comparing it to the libraries listed below:
- LM inference server implementation based on *.cpp. ☆131 Updated this week
- A text-to-speech and speech-to-text server compatible with the OpenAI API, supporting Whisper, FunASR, Bark, and CosyVoice backends. ☆70 Updated 2 months ago
- Automatically quantize GGUF models ☆160 Updated this week
- Transparent proxy server for llama.cpp's server to provide automatic model swapping ☆421 Updated this week
- Comparison of Language Model Inference Engines ☆207 Updated 2 months ago
- A memory framework for Large Language Models and Agents. ☆177 Updated 2 months ago
- ☆136 Updated 3 weeks ago
- Pure C++ implementation of several models for real-time chatting on your computer (CPU & GPU) ☆539 Updated this week
- VSCode AI coding assistant powered by self-hosted llama.cpp endpoint. ☆180 Updated last month
- ☆194 Updated last month
- Something similar to Apple Intelligence? ☆59 Updated 8 months ago
- ggml implementation of embedding models including SentenceTransformer and BGE ☆54 Updated last year
- Turns devices into a scalable LLM platform ☆124 Updated this week
- Educational framework exploring ergonomic, lightweight multi-agent orchestration. Modified to use local Ollama endpoint ☆49 Updated 4 months ago
- Efficient visual programming for AI language models ☆347 Updated 6 months ago
- Building open version of OpenAI o1 via reasoning traces (Groq, ollama, Anthropic, Gemini, OpenAI, Azure supported) Demo: https://hugging… ☆175 Updated 4 months ago
- AI for all: Build the large graph of the language models ☆263 Updated 9 months ago
- Open Source Text Embedding Models with OpenAI Compatible API ☆147 Updated 8 months ago
- Distributed Inference for mlx LLM ☆84 Updated 7 months ago
- llama.cpp fork with additional SOTA quants and improved performance ☆202 Updated this week
- Download models from the Ollama library, without Ollama ☆62 Updated 4 months ago
- The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs). Allowing users to chat with LLM … ☆538 Updated 3 weeks ago
- ☆59 Updated 10 months ago
- Uses the latest graphrag interface, with local ollama providing the LLM interface. Supports installation via pip ☆142 Updated 5 months ago
- 📋 NotebookMLX - An Open Source version of NotebookLM (Ported NotebookLlama) ☆258 Updated last week
- LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA ☆477 Updated 2 months ago