AlexBuz / llama-zip
LLM-powered lossless compression tool
☆302 Updated last month
Alternatives and similar repositories for llama-zip
Users who are interested in llama-zip are comparing it to the libraries listed below
- Experimental adventure game with AI-generated content ☆111 Updated 9 months ago
- ☆337 Updated 6 months ago
- This is our own implementation of 'Layer Selective Rank Reduction' ☆240 Updated last year
- LLM-based code completion engine ☆190 Updated last year
- 1.58-bit LLaMa model ☆82 Updated last year
- Low-Rank adapter extraction for fine-tuned transformers models ☆180 Updated last year
- A fast batching API to serve LLM models ☆189 Updated last year
- Train your own small bitnet model ☆77 Updated last year
- Falcon LLM ggml framework with CPU and GPU support ☆249 Updated 2 years ago
- Web UI for ExLlamaV2 ☆513 Updated last year
- Testing LLM reasoning abilities with family relationship quizzes. ☆63 Updated last year
- Inference of Mamba and Mamba2 models in pure C ☆196 Updated 2 weeks ago
- ☆135 Updated 9 months ago
- Stop messing around with finicky sampling parameters and just use DRµGS! ☆360 Updated last year
- klmbr - a prompt pre-processing technique to break through the barrier of entropy while generating text with LLMs ☆86 Updated last year
- TypeScript generator for llama.cpp Grammar directly from TypeScript interfaces ☆141 Updated last year
- ☆166 Updated 6 months ago
- Embed arbitrary modalities (images, audio, documents, etc) into large language models. ☆189 Updated last year
- An OpenAI API compatible API for chat with image input and questions about the images. aka Multimodal. ☆266 Updated 11 months ago
- Comparison of the output quality of quantization methods, using Llama 3, transformers, GGUF, EXL2. ☆165 Updated last year
- ☆109 Updated 5 months ago
- Mistral7B playing DOOM ☆139 Updated last year
- LLaVA server (llama.cpp). ☆183 Updated 2 years ago
- AI management tool ☆119 Updated last year
- An unsupervised model merging algorithm for Transformers-based language models. ☆108 Updated last year
- A multimodal, function calling powered LLM webui. ☆216 Updated last year
- Fast parallel LLM inference for MLX ☆246 Updated last year
- ☆32 Updated 2 years ago
- Formatron empowers everyone to control the format of language models' output with minimal overhead. ☆234 Updated 8 months ago
- Python bindings for ggml ☆147 Updated last year