iangitonga / tinyllama.cpp
A C++ implementation of tinyllama inference on CPU.
☆11Updated last year
Alternatives and similar repositories for tinyllama.cpp
Users who are interested in tinyllama.cpp are comparing it to the repositories listed below.
- Training a reward model for RLHF using RWKV.☆14Updated last year
- Trying to deconstruct RWKV in understandable terms☆14Updated 2 years ago
- instinct.cpp provides ready to use alternatives to OpenAI Assistant API and built-in utilities for developing AI Agent applications (RAG,…☆48Updated 10 months ago
- Yet Another (LLM) Web UI, made with Gemini☆12Updated 4 months ago
- cortex.llamacpp is a high-efficiency C++ inference engine for edge computing. It is a dynamic library that can be loaded by any server a…☆40Updated this week
- An open-source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO☆29Updated this week
- Compare openresty vs nginx + PUC_lua☆16Updated last year
- A converter and basic tester for rwkv onnx☆42Updated last year
- ☆19Updated 3 months ago
- Inference of Llama/Llama2/Llama3 models in NumPy☆20Updated last year
- Simple, fast, parallel Hugging Face GGML model downloader written in Python☆24Updated last year
- A chat UI for Llama.cpp☆13Updated last week
- A low-latency, fault-tolerant API for accessing LLMs, written in C++ using llama.cpp.☆10Updated last month
- 33B Chinese LLM, DPO QLORA, 100K context, AirLLM 70B inference with single 4GB GPU☆13Updated last year
- RWKV models and examples powered by candle.☆18Updated 2 months ago
- The heart of The Pulsar App: fast, secure, and shared inference with a modern UI☆56Updated 5 months ago
- ☆18Updated 7 months ago
- ggml implementation of BERT Embedding☆25Updated last year
- This project demonstrates the computation process of the RWKV (Receptance Weighted Key Value) model through Excel spreadsheets.☆14Updated 2 weeks ago
- Light WebUI for lm.rs☆23Updated 7 months ago
- Course Project for COMP4471 on RWKV☆17Updated last year
- ☆19Updated last month
- JAX implementations of RWKV☆19Updated last year
- Tensor library for machine learning☆21Updated last year
- A combination of Oobabooga's fork and the main cuda branch of GPTQ-for-LLaMa in a package format.☆22Updated last year
- Nexusflow function call, tool use, and agent benchmarks.☆19Updated 5 months ago
- Run ONNX RWKV-v4 models with GPU acceleration using DirectML [Windows], or just on CPU [Windows AND Linux]; Limited to 430M model at this…☆20Updated 2 years ago
- AirLLM 70B inference with single 4GB GPU☆12Updated 9 months ago
- Train your own small bitnet model☆70Updated 6 months ago
- Build HTML artefacts with Ollama☆11Updated 5 months ago