A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
☆64 · Oct 13, 2023 · Updated 2 years ago
Alternatives and similar repositories for exllama
Users that are interested in exllama are comparing it to the libraries listed below.
- A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights. ☆2,921 · Sep 30, 2023 · Updated 2 years ago
- Boosting Natural Language Generation from Instructions with Meta-Learning ☆11 · Dec 20, 2022 · Updated 3 years ago
- Create Unmute voice embeddings ☆25 · Nov 15, 2025 · Updated 6 months ago
- Let us make Psychohistory (as in Asimov) a reality, and accessible to everyone. Useful for LLM grounding and games / fiction / business /… ☆40 · Apr 9, 2023 · Updated 3 years ago
- Torchserve + TensorRT + Detection ☆19 · Feb 16, 2022 · Updated 4 years ago
- LLM finetuning ☆42 · Aug 9, 2023 · Updated 2 years ago
- WebUI StartGUI is a Python graphical user interface (GUI) written with PyQT5, that allows users to configure settings and start the oobab… ☆16 · Jun 3, 2023 · Updated 2 years ago
- Produce your own Dynamic 3.0 Quants and achieve optimum accuracy & SOTA quantization performance! Input a target size and the toolchain w… ☆130 · May 11, 2026 · Updated last week
- Multichannel Looper/Feedback System for Riffusion ☆14 · May 6, 2023 · Updated 3 years ago
- An insanely secure password manager. ☆17 · Mar 10, 2026 · Updated 2 months ago
- An intelligent code optimization system leveraging AI analysis, automated refactoring, and test generation. Built with DSPy and Gradio, i… ☆20 · Feb 1, 2025 · Updated last year
- An easy-to-use LLMs quantization package with user-friendly apis, based on GPTQ algorithm. ☆5,060 · Apr 11, 2025 · Updated last year
- A trade robot on pumpfun use DeepSeek AI ☆12 · Feb 5, 2025 · Updated last year
- A transformers implementation of csm-streaming ☆30 · May 16, 2025 · Updated last year
- A simple Node.js server to run nsfw.js for images from IPFS and return its results. ☆14 · Sep 27, 2022 · Updated 3 years ago
- Generate images from an initial frame and text ☆37 · Jul 28, 2023 · Updated 2 years ago
- Prototype UI for chatting with the Pygmalion models. ☆237 · Jun 1, 2023 · Updated 2 years ago
- High-performance ASR tool using Faster Whisper, supporting custom models, multi-language transcription, and real-time processing feedback… ☆10 · Sep 17, 2025 · Updated 8 months ago
- A fast inference library for running LLMs locally on modern consumer-class GPUs ☆4,521 · Mar 4, 2026 · Updated 2 months ago
- ☆54 · Jun 11, 2023 · Updated 2 years ago
- dgenerate is a scriptable command line tool (and library) for generating images and animation sequences using stable diffusion and relate… ☆44 · Oct 15, 2025 · Updated 7 months ago
- ☆13 · Nov 3, 2021 · Updated 4 years ago
- python package of rocm-smi-lib ☆25 · Dec 15, 2025 · Updated 5 months ago
- Extension for Text Generation Webui based on EdgeGPT, a reverse engineered API of Microsoft's Bing Chat AI ☆124 · Oct 2, 2023 · Updated 2 years ago
- ☆21 · Sep 11, 2023 · Updated 2 years ago
- Tokenizer for Text to Speech (TTS) models ☆13 · Jan 16, 2025 · Updated last year
- An auto save extension for text generated with the oobabooga WebUI ☆26 · Oct 6, 2025 · Updated 7 months ago
- ☆21 · Mar 3, 2025 · Updated last year
- Steering LLM Thinking with Budget Guidance ☆30 · Feb 19, 2026 · Updated 3 months ago
- API for extending the Obsidian plugin Juggl ☆29 · Nov 5, 2023 · Updated 2 years ago
- ☆16 · Feb 10, 2023 · Updated 3 years ago
- ☆17 · Jan 2, 2023 · Updated 3 years ago
- PowerShell based network attached device monitor ☆11 · Sep 3, 2024 · Updated last year
- Calling LLM APIs on a Raspberry Pi for lulz ☆24 · Apr 17, 2023 · Updated 3 years ago
- GPTQ inference Triton kernel ☆322 · May 18, 2023 · Updated 3 years ago
- Personalized all-purpose AI assistance platform based on hierarchical cooperative multi-agent framework which utilizes websocket connecti… ☆38 · Aug 11, 2024 · Updated last year
- A libp2p node with rpc using protocol buffers ☆16 · Dec 7, 2022 · Updated 3 years ago
- 4 bits quantization of LLaMA using GPTQ ☆3,072 · Jul 13, 2024 · Updated last year
- ☆136 · May 3, 2026 · Updated 2 weeks ago