Comparison of the output quality of quantization methods, using Llama 3, transformers, GGUF, EXL2.
☆169 · May 16, 2024 · Updated last year
Alternatives and similar repositories for llama-3-quant-comparison
Users that are interested in llama-3-quant-comparison are comparing it to the libraries listed below.
- Attend - to what matters. ☆17 · Feb 22, 2025 · Updated last year
- AirLLM 70B inference with single 4GB GPU ☆20 · Jun 27, 2025 · Updated 9 months ago
- Web Interface for Vision Language Models Including InternVLM2 ☆26 · Jul 29, 2024 · Updated last year
- Experimental LLM Inference UX to aid in creative writing ☆128 · Dec 14, 2024 · Updated last year
- Web UI for ExLlamaV2 ☆511 · Feb 5, 2025 · Updated last year
- The official API server for Exllama. OAI compatible, lightweight, and fast. ☆1,162 · Updated this week
- A data visualisation of 100 responses from local LLMs asked to imagine a random person. ☆24 · Nov 4, 2024 · Updated last year
- ☆23 · Jun 4, 2024 · Updated last year
- an auto-sleeping and -waking framework around llama.cpp ☆12 · Feb 8, 2025 · Updated last year
- Who needs o1 anyways. Add CoT to any OpenAI compatible endpoint. ☆44 · Sep 17, 2024 · Updated last year
- Experimental sampler to make LLMs more creative ☆31 · Aug 2, 2023 · Updated 2 years ago
- Mixture-of-Ollamas ☆30 · Aug 12, 2024 · Updated last year
- This repo contains the source code for RULER: What’s the Real Context Size of Your Long-Context Language Models? ☆1,488 · Nov 13, 2025 · Updated 4 months ago
- ☆16 · Jul 13, 2024 · Updated last year
- llama.cpp fork with additional SOTA quants and improved performance ☆1,895 · Updated this week
- Reliable model swapping for any local OpenAI/Anthropic compatible server - llama.cpp, vllm, etc ☆2,945 · Updated this week
- Simple LLM inference server ☆20 · Jun 13, 2024 · Updated last year
- ☆73 · Jun 20, 2025 · Updated 9 months ago
- A stock market bot that automatically, once a day, rebalances your Robinhood portfolio by gathering information about each ticker in the … ☆62 · Feb 25, 2025 · Updated last year
- Large-scale LLM inference engine ☆1,681 · Mar 12, 2026 · Updated 2 weeks ago
- Practical and advanced guide to LLMOps. It provides a solid understanding of large language models’ general concepts, deployment techniqu… ☆80 · Aug 16, 2024 · Updated last year
- A very simple interactive demo to understand the common LLM samplers. ☆41 · Jul 9, 2024 · Updated last year
- Copilot with deepseek and more... ☆13 · Mar 7, 2025 · Updated last year
- a browser gui for nvidia smi ☆20 · Mar 17, 2025 · Updated last year
- ☆35 · May 9, 2024 · Updated last year
- SLOP Detector and analyzer based on dictionary for shareGPT JSON and text ☆84 · Mar 12, 2026 · Updated 2 weeks ago
- Copy a bunch of files into your clipboard to provide context for LLMs ☆113 · Feb 8, 2026 · Updated last month
- Benchmark evaluating LLMs on their ability to create and resist disinformation. Includes comprehensive testing across major models (Claud… ☆31 · Mar 20, 2025 · Updated last year
- ☆15 · Feb 1, 2025 · Updated last year
- Official repository for the paper "NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks". This rep… ☆60 · Oct 31, 2024 · Updated last year
- [ICLR 2024] Jaiswal, A., Gan, Z., Du, X., Zhang, B., Wang, Z., & Yang, Y. Compressing llms: The truth is rarely pure and never simple. ☆27 · Apr 21, 2025 · Updated 11 months ago
- A fast inference library for running LLMs locally on modern consumer-class GPUs ☆4,476 · Mar 4, 2026 · Updated 3 weeks ago
- Docker images and configuration to run text-generation-webui with GPU or CPU support ☆32 · Mar 19, 2024 · Updated 2 years ago
- LexiCrawler is a powerful Go-based web crawling API meticulously designed to extract, clean, and transform web page content into a pristi… ☆48 · Feb 27, 2025 · Updated last year
- The one who calls upon functions - Function-Calling Language Model ☆36 · Oct 2, 2023 · Updated 2 years ago
- ☆19 · Jul 12, 2025 · Updated 8 months ago
- The hearth of The Pulsar App, fast, secure and shared inference with modern UI ☆59 · Dec 1, 2024 · Updated last year
- Glyphs, acting as collaboratively defined symbols linking related concepts, add a layer of multidimensional semantic richness to user-AI … ☆56 · Feb 10, 2025 · Updated last year
- CaSIL is an advanced natural language processing system that implements a sophisticated four-layer semantic analysis architecture. It pro… ☆67 · Nov 5, 2024 · Updated last year