AlexBuz / llama-zip
LLM-powered lossless compression tool
☆295 · Updated last year
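The usual idea behind LLM-powered lossless compression is arithmetic coding driven by a language model's next-token probabilities: the more confidently the model predicts each token, the narrower the interval update and the fewer bits the sequence costs. The sketch below illustrates that mechanism only; the toy uniform "model", vocabulary, and coder are illustrative assumptions, not llama-zip's actual implementation (which drives a real LLM).

```python
# Minimal sketch of arithmetic coding over a language model's next-token
# probabilities. The toy_model here is a hypothetical stand-in for an LLM.
from fractions import Fraction

VOCAB = ["the", "cat", "sat", "."]

def toy_model(context):
    """Return a next-token distribution given the context (uniform toy model)."""
    return {tok: Fraction(1, len(VOCAB)) for tok in VOCAB}

def encode(tokens):
    """Narrow the interval [low, high) by each token's probability slice."""
    low, high = Fraction(0), Fraction(1)
    for i, tok in enumerate(tokens):
        probs = toy_model(tokens[:i])
        span = high - low
        cum = Fraction(0)
        for cand in VOCAB:
            if cand == tok:
                high = low + span * (cum + probs[cand])
                low = low + span * cum
                break
            cum += probs[cand]
    # Any number inside the final interval identifies the whole sequence.
    return low, high

def decode(low, high, length):
    """Recover the token sequence from a point inside the final interval."""
    point = (low + high) / 2
    out = []
    lo, hi = Fraction(0), Fraction(1)
    for _ in range(length):
        probs = toy_model(out)
        span = hi - lo
        cum = Fraction(0)
        for cand in VOCAB:
            nxt = cum + probs[cand]
            if lo + span * cum <= point < lo + span * nxt:
                out.append(cand)
                lo, hi = lo + span * cum, lo + span * nxt
                break
            cum = nxt
    return out

if __name__ == "__main__":
    msg = ["the", "cat", "sat", "."]
    lo, hi = encode(msg)
    assert decode(lo, hi, len(msg)) == msg
```

The width of the final interval equals the product of the assigned probabilities, so a model that predicts the data well yields a wide interval and a short encoding; in a real tool the bits of a number inside that interval are what get written out.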
Alternatives and similar repositories for llama-zip
Users who are interested in llama-zip are comparing it to the libraries listed below.
- Train your own small bitnet model ☆76 · Updated last year
- A fast batching API to serve LLM models ☆189 · Updated last year
- Experimental adventure game with AI-generated content ☆111 · Updated 8 months ago
- Comparison of the output quality of quantization methods, using Llama 3, transformers, GGUF, EXL2. ☆165 · Updated last year
- Falcon LLM ggml framework with CPU and GPU support ☆249 · Updated last year
- Stop messing around with finicky sampling parameters and just use DRµGS! ☆360 · Updated last year
- ☆333 · Updated 5 months ago
- 1.58-bit LLaMa model ☆83 · Updated last year
- Testing LLM reasoning abilities with family relationship quizzes. ☆63 · Updated 11 months ago
- AI management tool ☆121 · Updated last year
- This is our own implementation of 'Layer Selective Rank Reduction' ☆240 · Updated last year
- klmbr - a prompt pre-processing technique to break through the barrier of entropy while generating text with LLMs ☆86 · Updated last year
- Inference of Mamba models in pure C ☆196 · Updated last year
- ☆135 · Updated 8 months ago
- LLM-based code completion engine ☆190 · Updated 11 months ago
- Low-Rank adapter extraction for fine-tuned transformers models ☆180 · Updated last year
- ☆165 · Updated 4 months ago
- Web UI for ExLlamaV2 ☆514 · Updated 10 months ago
- Automatically quantize GGUF models ☆218 · Updated last week
- Experimental LLM Inference UX to aid in creative writing ☆127 · Updated last year
- Mistral7B playing DOOM ☆138 · Updated last year
- Like grep but for natural language questions. Based on Mistral 7B or Mixtral 8x7B. ☆386 · Updated last year
- A multimodal, function calling powered LLM webui. ☆217 · Updated last year
- The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs). Allowing users to chat with LLM … ☆610 · Updated 10 months ago
- ☆32 · Updated 2 years ago
- Fast parallel LLM inference for MLX ☆240 · Updated last year
- An implementation of bucketMul LLM inference ☆223 · Updated last year
- Convenience scripts to finetune (chat-)LLaMa3 and other models for any language ☆314 · Updated last year
- Embed arbitrary modalities (images, audio, documents, etc) into large language models. ☆188 · Updated last year
- TypeScript generator for llama.cpp Grammar directly from TypeScript interfaces ☆141 · Updated last year