gotzmann / booster
Booster - open accelerator for LLM models. Better inference and debugging for AI hackers
☆154 · Updated 8 months ago
Alternatives and similar repositories for booster:
Users who are interested in booster are comparing it to the libraries listed below.
- Binding to transformers in ggml ☆60 · Updated 2 weeks ago
- ☆16 · Updated 11 months ago
- 4-bit quantization of SantaCoder using GPTQ ☆51 · Updated last year
- A simple vector database: Text encoding, semantic search, document storage ☆89 · Updated last year
- Llama 2 inference in one file of pure Go ☆104 · Updated last year
- An endpoint server for efficiently serving quantized open-source LLMs for code. ☆54 · Updated last year
- A guidance compatibility layer for llama-cpp-python ☆34 · Updated last year
- A Go wrapper around the rwkv.cpp library ☆20 · Updated last year
- Port of Facebook's LLaMA (Large Language Model Meta AI) in Golang with embedded C/C++ ☆167 · Updated last year
- ☆55 · Updated last year
- RightHand - A GPT4 powered assistive tool. ☆109 · Updated 3 months ago
- ☆38 · Updated last year
- GPT-2 small trained on phi-like data ☆66 · Updated last year
- Extracts structured data from unstructured input. Programming language agnostic. Uses llama.cpp ☆45 · Updated 11 months ago
- Deploy your GGML models to HuggingFace Spaces with Docker and Gradio ☆36 · Updated last year
- Inference Llama 2 in Go ☆39 · Updated last year
- A simple GUI utility for gathering LIMA-like chat data. ☆23 · Updated last month
- TypeScript generator for llama.cpp Grammar directly from TypeScript interfaces ☆135 · Updated 9 months ago
- Falcon LLM ggml framework with CPU and GPU support ☆246 · Updated last year
- ☆31 · Updated last year
- A Python package for serving LLM on OpenAI-compatible API endpoints with prompt caching using MLX. ☆76 · Updated 4 months ago
- Inference Llama 2 in one file of pure Go ☆16 · Updated last year
- 🐦 An open, blazing-fast, simple model gateway for rapid development of production GenAI apps ☆144 · Updated 8 months ago
- Local LLaMAs/Models in VSCode ☆53 · Updated last year
- Go client for txtai ☆78 · Updated this week
- Web UI for working with large language models ☆32 · Updated 10 months ago
- A fast batching API to serve LLM models ☆182 · Updated 11 months ago
- Extend the original llama.cpp repo to support the RedPajama model. ☆117 · Updated 7 months ago
- ☆24 · Updated 2 months ago
- A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.