furiousteabag / vram-calculator
Transformer GPU VRAM estimator
☆67 Updated last year
Alternatives and similar repositories for vram-calculator
Users who are interested in vram-calculator are comparing it to the libraries listed below
- Train, tune, and infer Bamba model ☆138 Updated 7 months ago
- Self-host LLMs with vLLM and BentoML ☆163 Updated this week
- Pivotal Token Search ☆142 Updated last month
- This is the documentation repository for SGLang. It is auto-generated from https://github.com/sgl-project/sglang ☆97 Updated this week
- An OpenAI Completions API compatible server for NLP transformers models ☆66 Updated 2 years ago
- ☆115 Updated 11 months ago
- Aana SDK is a powerful framework for building AI enabled multimodal applications. ☆55 Updated 4 months ago
- ☆67 Updated 9 months ago
- Public reports detailing responses to sets of prompts by Large Language Models. ☆32 Updated last year
- ☆68 Updated last year
- 🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models. ☆138 Updated last year
- GRDN.AI app for garden optimization ☆69 Updated 2 months ago
- Simple examples using Argilla tools to build AI ☆57 Updated last year
- A tool that facilitates easy, efficient and high-quality fine-tuning of Cohere's models ☆76 Updated 10 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆53 Updated 2 years ago
- GPT-4 Level Conversational QA Trained In a Few Hours ☆66 Updated last year
- ☆198 Updated last year
- IBM development fork of https://github.com/huggingface/text-generation-inference ☆63 Updated 4 months ago
- Benchmarks comparing PyTorch and MLX on Apple Silicon GPUs ☆92 Updated last week
- Simple high-throughput inference library ☆155 Updated 8 months ago
- Chat Markup Language conversation library ☆55 Updated 2 years ago
- Just a bunch of benchmark logs for different LLMs ☆119 Updated last year
- Lightweight toolkit package to train and fine-tune 1.58bit Language models ☆107 Updated 8 months ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆59 Updated 3 months ago
- The backend behind the LLM-Perf Leaderboard ☆11 Updated last year
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models ☆114 Updated 9 months ago
- A collection of all available inference solutions for the LLMs ☆94 Updated 10 months ago
- Notus is a collection of fine-tuned LLMs using SFT, DPO, SFT+DPO, and/or any other RLHF techniques, while always keeping a data-first app… ☆169 Updated 2 years ago
- Machine Learning Serving focused on GenAI with simplicity as the top priority. ☆59 Updated 2 weeks ago
- ☆165 Updated 5 months ago