furiousteabag / vram-calculator
Transformer GPU VRAM estimator
☆58 Updated 11 months ago
Alternatives and similar repositories for vram-calculator:
Users who are interested in vram-calculator are comparing it to the repositories listed below.
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks ☆31 Updated 10 months ago
- ☆22 Updated last year
- GRDN.AI app for garden optimization ☆70 Updated last year
- Aana SDK is a powerful framework for building AI-enabled multimodal applications. ☆43 Updated last week
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆91 Updated 2 weeks ago
- IBM development fork of https://github.com/huggingface/text-generation-inference ☆60 Updated 3 months ago
- Public reports detailing responses to sets of prompts by Large Language Models. ☆30 Updated 2 months ago
- ☆112 Updated last month
- GroqFlow provides an automated tool flow for compiling machine learning and linear algebra workloads into Groq programs and executing tho… ☆108 Updated 2 weeks ago
- ☆60 Updated last year
- A minimalistic C++ Jinja templating engine for LLM chat templates ☆128 Updated 2 weeks ago
- inference code for mixtral-8x7b-32kseqlen ☆99 Updated last year
- ☆22 Updated last year
- Because it's there. ☆15 Updated 6 months ago
- Ongoing research training transformer models at scale ☆35 Updated last year
- Blindspots in LLMs I've noticed while AI coding. Sonnet family emphasis. ☆11 Updated this week
- Prepare for DeepSeek R1 inference: Benchmark CPU, DRAM, SSD, iGPU, GPU, ... with efficient code. ☆70 Updated last month
- ☆38 Updated 7 months ago
- look how they massacred my boy ☆63 Updated 5 months ago
- Synthetic data derived by templating, few shot prompting, transformations on public domain corpora, and monte carlo tree search. ☆31 Updated 3 weeks ago
- Embedding models from Jina AI ☆58 Updated last year
- Self-hosted LLM chatbot arena, with yourself as the only judge ☆38 Updated last year
- Editor with LLM generation tree exploration ☆65 Updated last month
- Modular, open source LLMOps stack that separates concerns: LiteLLM unifies LLM APIs, manages routing and cost controls, and ensures high-… ☆89 Updated last month
- ☆73 Updated last year
- Hallucinations (Confabulations) Document-Based Benchmark for RAG. Includes human-verified questions and answers. ☆116 Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆51 Updated last year
- Chat Markup Language conversation library ☆55 Updated last year
- Distributed inference for MLX LLMs ☆87 Updated 7 months ago
- A clone of OpenAI's Tokenizer page for HuggingFace Models ☆45 Updated last year