RahulSChand / gpu_poor
Calculate token/s & GPU memory requirement for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization
☆1,288 Updated 4 months ago
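The headline number gpu_poor produces can be approximated by hand: model weights dominate, scaled by the bytes per parameter of the chosen quantization, plus the KV cache and some runtime overhead. A minimal sketch of that estimate, assuming illustrative values (the function name, the 20% overhead factor, and the default layer/hidden sizes are assumptions for this example, not gpu_poor's actual code):

```python
# Rough VRAM estimate for LLM inference:
#   memory ≈ (weights + KV cache) * (1 + overhead)
# Bytes per parameter depend on the quantization scheme.
BYTES_PER_PARAM = {
    "fp16": 2.0,
    "int8": 1.0,   # e.g. bitsandbytes 8-bit
    "int4": 0.5,   # e.g. QLoRA / 4-bit GGML
}

def estimate_gpu_memory_gb(n_params_b, quant="fp16",
                           n_layers=32, hidden=4096,
                           context_len=2048, batch=1,
                           overhead=0.20):
    """Return an approximate VRAM requirement in GiB.

    n_params_b: parameter count in billions; the layer/hidden
    defaults roughly match a 7B Llama-style model (assumption).
    """
    weights = n_params_b * 1e9 * BYTES_PER_PARAM[quant]
    # KV cache: K and V tensors per layer, stored in fp16 (2 bytes)
    kv_cache = 2 * n_layers * hidden * context_len * batch * 2
    total = (weights + kv_cache) * (1 + overhead)
    return total / 2**30

# A 7B model in 4-bit: weights alone are ~3.5 GB before cache/overhead.
print(round(estimate_gpu_memory_gb(7, "int4"), 1))
```

This is why a 7B model that needs a high-end GPU at fp16 fits on a consumer card at 4-bit; tools like gpu_poor refine this with per-backend details (llama.cpp/ggml/bnb/QLoRA).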
Alternatives and similar repositories for gpu_poor:
Users interested in gpu_poor are comparing it to the libraries listed below
- LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability…☆3,152 Updated this week
- Minimalistic large language model 3D-parallelism training☆1,808 Updated this week
- AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.☆2,104 Updated 2 weeks ago
- Fast, Flexible and Portable Structured Generation☆888 Updated 2 weeks ago
- Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.☆2,360 Updated last week
- LLMPerf is a library for validating and benchmarking LLMs☆876 Updated 4 months ago
- A collection of awesome-prompt-datasets and awesome-instruction-dataset resources for training chat LLMs such as ChatGPT; gathers a wide variety of instruction datasets for training ChatLLM models.☆654 Updated last year
- A library for advanced large language model reasoning☆2,099 Updated 2 weeks ago
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified…☆2,659 Updated this week
- Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads☆2,503 Updated 10 months ago
- Summarizes existing representative LLM text datasets.☆1,247 Updated last month
- Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM☆1,251 Updated this week
- Tools for merging pretrained large language models.☆5,571 Updated this week
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends☆1,438 Updated last week
- An efficient, flexible and full-featured toolkit for fine-tuning LLMs (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)☆4,497 Updated 2 weeks ago
- A family of open-sourced Mixture-of-Experts (MoE) Large Language Models☆1,508 Updated last year
- ☆936 Updated 2 months ago
- A quick guide (especially) for trending instruction finetuning datasets☆3,011 Updated last year
- [MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration☆2,942 Updated last week
- ☆1,161 Updated last month
- S-LoRA: Serving Thousands of Concurrent LoRA Adapters☆1,818 Updated last year
- A reading list on LLM-based Synthetic Data Generation 🔥☆1,246 Updated 2 months ago
- Doing simple retrieval from LLMs at various context lengths to measure accuracy☆1,829 Updated 8 months ago
- Data and tools for generating and inspecting OLMo pre-training data.☆1,198 Updated last week
- Chat Templates for 🤗 HuggingFace Large Language Models☆651 Updated 4 months ago
- MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024).☆1,288 Updated this week
- An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.☆4,818 Updated 2 weeks ago
- [ICLR 2025] Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing. Your efficient and high-quality synthetic data …☆681 Updated last month
- LMDeploy is a toolkit for compressing, deploying, and serving LLMs.☆6,183 Updated this week
- AllenAI's post-training codebase☆2,913 Updated this week