shubham0204 / llama.cpp-simple-chat-interface
Build a simple CMD chat interface with llama.cpp and C++
☆9Updated 2 months ago
Alternatives and similar repositories for llama.cpp-simple-chat-interface
Users who are interested in llama.cpp-simple-chat-interface are comparing it to the libraries listed below
- instinct.cpp provides ready to use alternatives to OpenAI Assistant API and built-in utilities for developing AI Agent applications (RAG,…☆47Updated 10 months ago
- GGML implementation of BERT model with Python bindings and quantization.☆56Updated last year
- A low-latency, fault-tolerant API for accessing LLMs, written in C++ using llama.cpp.☆10Updated last month
- Inference Llama/Llama2/Llama3 Models in NumPy☆20Updated last year
- ggml implementation of embedding models including SentenceTransformer and BGE☆57Updated last year
- Inference slice of marian for bergamot's tiny11 models. Faster to compile, and wield. Fewer model-archs than bergamot-translator.☆11Updated 6 months ago
- CI for ggml and related projects☆28Updated this week
- A chat UI for Llama.cpp☆13Updated this week
- General purpose GPU compute framework built on Vulkan to support 1000s of cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). …☆46Updated 2 months ago
- Experiments with BitNet inference on CPU☆54Updated last year
- Web browser version of StarCoder.cpp☆45Updated last year
- GGML implementation of BERT model with Python bindings and quantization.☆26Updated last year
- Rust crate for some audio utilities☆23Updated 2 months ago
- Extract structured data from local or remote LLM models☆42Updated 10 months ago
- Recording models☆13Updated last year
- Inference RWKV v5, v6 and v7 with Qualcomm AI Engine Direct SDK☆64Updated this week
- Lightweight Llama 3 8B Inference Engine in CUDA C☆47Updated last month
- A sleek, customizable interface for managing LLMs with responsive design and easy agent personalization.☆15Updated 8 months ago
- Inference deployment of the llama3☆11Updated last year
- cortex.llamacpp is a high-efficiency C++ inference engine for edge computing. It is a dynamic library that can be loaded by any server a…☆40Updated this week
- Simple, Fast, Parallel Huggingface GGML model downloader written in python☆24Updated last year
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min…☆26Updated 6 months ago
- UnitEval is a benchmarking and evaluation tool for AutoDev Coder.☆12Updated last year
- Aana SDK is a powerful framework for building AI enabled multimodal applications.☆47Updated this week
- An AI Vision Language Model System for extracting structured knowledge graph information (JSON) from images of process diagrams☆20Updated last month
- Inference Vision Transformer (ViT) in plain C/C++ with ggml☆30Updated last year
- KAN (Kolmogorov–Arnold Networks) in the MLX framework for Apple Silicon☆16Updated last week
- Inference Llama 2 in one file of pure Cuda☆17Updated last year
- ☆10Updated 10 months ago
- A minimalist implementation of the ViT (Vision Transformer) model, using tinygrad☆12Updated 8 months ago