furiousteabag / vram-calculator
Transformer GPU VRAM estimator
☆67 · Updated last year
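As a rough illustration of what a transformer VRAM estimator computes, here is a minimal Python sketch that sums model weights and KV cache memory for inference. This is an assumption about the general approach, not vram-calculator's actual formula, and it ignores activation and framework overhead; the function name and parameters are hypothetical.

```python
# Minimal sketch of a transformer inference VRAM estimate (an assumption,
# not vram-calculator's actual formula): weights + KV cache only.

def estimate_inference_vram_gib(
    n_params_b: float,      # model size in billions of parameters
    bytes_per_param: int,   # 2 for fp16/bf16, 1 for int8, etc.
    n_layers: int,          # number of transformer layers
    n_kv_heads: int,        # number of key/value heads
    head_dim: int,          # dimension per attention head
    seq_len: int,           # context length in tokens
    batch_size: int,
    kv_bytes: int = 2,      # KV cache precision, fp16 by default
) -> float:
    weights = n_params_b * 1e9 * bytes_per_param
    # KV cache: 2 (K and V) * layers * kv_heads * head_dim * seq_len * batch
    kv_cache = 2 * n_layers * n_kv_heads * head_dim * seq_len * batch_size * kv_bytes
    return (weights + kv_cache) / 1024**3

# Example: a 7B model with a Llama-2-7B-like config in fp16, 4k context, batch 1
print(f"{estimate_inference_vram_gib(7.0, 2, 32, 32, 128, 4096, 1):.1f} GiB")
```

For the example config this prints roughly 15 GiB (about 13 GiB of weights plus 2 GiB of KV cache); real usage is higher once activations, CUDA context, and fragmentation are included.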
Alternatives and similar repositories for vram-calculator
Users interested in vram-calculator are comparing it to the repositories listed below.
- ☆116 · Updated 8 months ago
- Aana SDK is a powerful framework for building AI-enabled multimodal applications. ☆52 · Updated 2 months ago
- Pivotal Token Search ☆130 · Updated 3 months ago
- Public reports detailing responses to sets of prompts by Large Language Models. ☆31 · Updated 9 months ago
- Inference code for mixtral-8x7b-32kseqlen ☆102 · Updated last year
- Train, tune, and infer Bamba model ☆135 · Updated 4 months ago
- Benchmarks comparing PyTorch and MLX on Apple Silicon GPUs ☆89 · Updated last year
- ☆67 · Updated last year
- IBM development fork of https://github.com/huggingface/text-generation-inference ☆61 · Updated last month
- An implementation of Self-Extend, to expand the context window via grouped attention ☆118 · Updated last year
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆58 · Updated last week
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens ☆145 · Updated 8 months ago
- Chat Markup Language conversation library ☆55 · Updated last year
- This is the documentation repository for SGLang. It is auto-generated from https://github.com/sgl-project/sglang/tree/main/docs. ☆86 · Updated this week
- ⚡️ A fast and flexible PyTorch inference server that runs locally, on any cloud or AI HW. ☆145 · Updated last year
- An OpenAI Completions API compatible server for NLP transformers models ☆64 · Updated last year
- Implementation of Nougat that focuses on processing PDFs locally ☆83 · Updated 9 months ago
- Just a bunch of benchmark logs for different LLMs ☆118 · Updated last year
- Inference of Mamba models in pure C ☆192 · Updated last year
- ☆23 · Updated 8 months ago
- GRDN.AI app for garden optimization ☆70 · Updated last year
- ☆46 · Updated 2 years ago
- SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks? ☆202 · Updated this week
- ☆64 · Updated 7 months ago
- A tool that facilitates easy, efficient, and high-quality fine-tuning of Cohere's models ☆73 · Updated 7 months ago
- ☆63 · Updated 4 months ago
- Lightweight toolkit package to train and fine-tune 1.58-bit language models ☆92 · Updated 5 months ago
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs ☆90 · Updated this week
- Google TPU optimizations for transformers models ☆121 · Updated 9 months ago
- ☆197 · Updated last year