shubham0204 / llama.cpp-simple-chat-interface

Build a simple CMD chat interface with llama.cpp and C++

☆9

Alternatives and similar repositories for llama.cpp-simple-chat-interface

Users who are interested in llama.cpp-simple-chat-interface are comparing it to the libraries listed below.

RobinQu / instinct.cpp
instinct.cpp provides ready to use alternatives to OpenAI Assistant API and built-in utilities for developing AI Agent applications (RAG,…
☆49 Updated 11 months ago
iamlemec / bert.cpp
GGML implementation of BERT model with Python bindings and quantization.
☆55 Updated last year
nomic-ai / kompute
General purpose GPU compute framework built on Vulkan to support 1000s of cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). …
☆48 Updated 3 months ago
thansen0 / fastllm.cpp
A low-latency, fault-tolerant API for accessing LLMs, written in C++ using llama.cpp.
☆10 Updated 2 months ago
wangzhaode / mnn-tts
mnn tts demo.
☆16 Updated 3 weeks ago
MollySophia / rwkv-qualcomm
Inference RWKV v5, v6 and v7 with Qualcomm AI Engine Direct SDK
☆68 Updated last week
MaggotHATE / Llama_chat
A chat UI for Llama.cpp
☆13 Updated this week
ngxson / ggml-easy
Thin wrapper around GGML to make life easier
☆34 Updated this week
EdVince / model_zoo
Recording models
☆13 Updated last year
rupeshs / shortgpt
Ask shortgpt for instant and concise answers
☆13 Updated 2 years ago
wangkuiyi / huggingface-tokenizer-in-cxx
☆66 Updated 2 years ago
xyzhang626 / embeddings.cpp
ggml implementation of embedding models including SentenceTransformer and BGE
☆58 Updated last year
jquesnelle / ctranslate2-rs
Rust bindings for CTranslate2
☆14 Updated last year
rlggyp / YOLOv10-OpenVINO-CPP-Inference
YOLOv10 C++ implementation using OpenVINO for efficient and accurate real-time object detection.
☆69 Updated 2 months ago
hscspring / llama.np
Inference Llama/Llama2/Llama3 Models in NumPy
☆21 Updated last year
kyutai-labs / kaudio
Rust crate for some audio utilities
☆23 Updated 2 months ago
marty1885 / paroli
Streaming TTS based on Piper with optional RK3588 NPU support
☆89 Updated last month
menloresearch / cortex.llamacpp
cortex.llamacpp is a high-efficiency C++ inference engine for edge computing. It is a dynamic library that can be loaded by any server a…
☆40 Updated this week
jerinphilip / slimt
Inference slice of marian for bergamot's tiny11 models. Faster to compile, and wield. Fewer model-archs than bergamot-translator.
☆11 Updated 7 months ago
TextGeneratorio / text-generator.io
Run Vision LLMs, TTS and STT APIs. Website and API for https://text-generator.io
☆35 Updated this week
inisis / OnnxSlim
A Toolkit to Help Optimize Onnx Model
☆153 Updated this week
xenova / model-explorer
Browse, search, and visualize ONNX models.
☆30 Updated last month
guoguo1314 / llama3_learn.c
Inference deployment of Llama 3
☆11 Updated last year
samuel-vitorino / lm.rs-webui
Light WebUI for lm.rs
☆23 Updated 7 months ago
kyutai-labs / moshi-webrtc
Proof of concept for running moshi/hibiki using webrtc
☆19 Updated 3 months ago
Zackriya-Solutions / diagram2graph
An AI Vision Language Model System for extracting structured knowledge graph information (JSON) from images of process diagrams
☆20 Updated 2 months ago
mmwillet / TTS.cpp
TTS support with GGML
☆43 Updated this week
openvinotoolkit / mlas
☆11 Updated 4 months ago
cwhy / rwkv-decon
Trying to deconstruct RWKV in understandable terms
☆14 Updated 2 years ago
daquexian / faster-rwkv
☆123 Updated last year