furiousteabag / vram-calculator
Transformer GPU VRAM estimator
☆58 Updated last year
Alternatives and similar repositories for vram-calculator:
Users who are interested in vram-calculator are comparing it to the repositories listed below:
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs ☆87 Updated this week
- Public reports detailing responses to sets of prompts by Large Language Models. ☆30 Updated 3 months ago
- ☆66 Updated 10 months ago
- ☆14 Updated 7 months ago
- GRDN.AI app for garden optimization ☆70 Updated last year
- ☆29 Updated 3 months ago
- A repository of projects and datasets under active development by Alignment Lab AI ☆22 Updated last year
- Lego for GRPO ☆26 Updated 2 weeks ago
- ⚡️ A fast and flexible PyTorch inference server that runs locally, on any cloud or AI HW. ☆140 Updated 10 months ago
- An example implementation of RLHF (or, more accurately, RLAIF) built on MLX and HuggingFace. ☆25 Updated 9 months ago
- 🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models. ☆136 Updated 8 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆52 Updated last year
- First token cutoff sampling inference example ☆29 Updated last year
- Hallucinations (Confabulations) Document-Based Benchmark for RAG. Includes human-verified questions and answers. ☆121 Updated this week
- ☆112 Updated 3 months ago
- ☆60 Updated last year
- A minimalistic C++ Jinja templating engine for LLM chat templates ☆131 Updated last week
- IBM development fork of https://github.com/huggingface/text-generation-inference ☆60 Updated 3 months ago
- look how they massacred my boy ☆63 Updated 6 months ago
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆95 Updated last month
- ☆53 Updated 11 months ago
- ☆48 Updated last year
- Distributed Inference for mlx LLm ☆87 Updated 8 months ago
- Self-host LLMs with vLLM and BentoML ☆102 Updated this week
- Google TPU optimizations for transformers models ☆107 Updated 2 months ago
- GroqFlow provides an automated tool flow for compiling machine learning and linear algebra workloads into Groq programs and executing tho… ☆109 Updated last month
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆39 Updated 2 months ago
- train with kittens! ☆56 Updated 5 months ago
- ☆207 Updated 2 months ago
- Simple LLM inference server ☆20 Updated 10 months ago