furiousteabag / vram-calculator

Transformer GPU VRAM estimator

☆66

Alternatives and similar repositories for vram-calculator

Users who are interested in vram-calculator compare it to the libraries listed below.

llmonpy / needle-in-a-needlestack
☆116 Updated 9 months ago
dropbox / aana_sdk
Aana SDK is a powerful framework for building AI enabled multimodal applications.
☆53 Updated 2 months ago
foundation-model-stack / bamba
Train, tune, and infer Bamba model
☆136 Updated 5 months ago
vikhyat / mixtral-inference
inference code for mixtral-8x7b-32kseqlen
☆102 Updated last year
codelion / pts
Pivotal Token Search
☆131 Updated 4 months ago
teknium1 / LLM-Logbook
Public reports detailing responses to sets of prompts by Large Language Models.
☆32 Updated 10 months ago
4dh / GRDN
GRDN.AI app for garden optimization
☆70 Updated last year
QuixiAI / kraken
☆67 Updated last year
firstbatchxyz / function-calling-eval
The DPAB-α Benchmark
☆30 Updated 10 months ago
bentoml / BentoVLLM
Self-host LLMs with vLLM and BentoML
☆156 Updated 3 weeks ago
AlexBodner / How_Much_VRAM
☆102 Updated last year
s-smits / grpo-optuna
Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna
☆58 Updated last month
substratusai / vllm-docker
☆64 Updated 7 months ago
LAION-AI / AIW
Alice in Wonderland code base for experiments and raw experiments data
☆131 Updated last month
Mihaiii / backtrack_sampler
An easy-to-understand framework for LLM samplers that rewind and revise generated tokens
☆145 Updated 8 months ago
teknium1 / LLM-Benchmark-Logs
Just a bunch of benchmark logs for different LLMs
☆118 Updated last year
OpenPipe / deductive-reasoning
Train your own SOTA deductive reasoning model
☆108 Updated 8 months ago
Cerebras / DocChat
GPT-4 Level Conversational QA Trained In a Few Hours
☆65 Updated last year
argilla-io / argilla-cookbook
Simple examples using Argilla tools to build AI
☆56 Updated last year
tensorwavecloud / ScalarLM
ScalarLM - a unified training and inference stack
☆93 Updated last week
mistralai / vllm-release
A high-throughput and memory-efficient inference and serving engine for LLMs
☆53 Updated last year
IBM / text-generation-inference
IBM development fork of https://github.com/huggingface/text-generation-inference
☆62 Updated 2 months ago
tiiuae / onebitllms
Lightweight toolkit package to train and fine-tune 1.58bit Language models
☆98 Updated 5 months ago
huggingface / optimum-tpu
Google TPU optimizations for transformers models
☆122 Updated 9 months ago
LucasSte / MLX-vs-Pytorch
Benchmarks comparing PyTorch and MLX on Apple Silicon GPUs
☆89 Updated last year
xjdr-alt / llmri
look how they massacred my boy
☆63 Updated last year
QuixiAI / OpenChatML
☆163 Updated 3 months ago
facebookresearch / matrix
Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera…
☆100 Updated last week
teknium1 / transformers-gptq-quant
☆45 Updated 2 years ago
IlyasMoutawwakil / llm-perf-backend
The backend behind the LLM-Perf Leaderboard
☆11 Updated last year