furiousteabag / vram-calculator
Transformer GPU VRAM estimator
☆67 · Updated last year
Alternatives and similar repositories for vram-calculator
Users who are interested in vram-calculator are comparing it to the repositories listed below.
- Aana SDK is a powerful framework for building AI-enabled multimodal applications. ☆55 · Updated 4 months ago
- ☆115 · Updated 11 months ago
- Just a bunch of benchmark logs for different LLMs ☆119 · Updated last year
- inference code for mixtral-8x7b-32kseqlen ☆104 · Updated 2 years ago
- Public reports detailing responses to sets of prompts by Large Language Models. ☆32 · Updated 11 months ago
- Lightweight toolkit package to train and fine-tune 1.58-bit language models ☆104 · Updated 7 months ago
- Train, tune, and infer the Bamba model ☆137 · Updated 6 months ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆59 · Updated 2 months ago
- ☆68 · Updated last year
- Simple examples using Argilla tools to build AI ☆57 · Updated last year
- an implementation of Self-Extend, to expand the context window via grouped attention ☆119 · Updated last year
- Simple high-throughput inference library ☆154 · Updated 7 months ago
- Self-host LLMs with vLLM and BentoML ☆163 · Updated last month
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks ☆31 · Updated last year
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens ☆150 · Updated this week
- Pivotal Token Search ☆141 · Updated last week
- ☆101 · Updated last year
- ☆165 · Updated 4 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆53 · Updated 2 years ago
- Chat Markup Language conversation library ☆55 · Updated last year
- Benchmarks comparing PyTorch and MLX on Apple Silicon GPUs ☆92 · Updated last year
- Embedding models from Jina AI ☆65 · Updated last year
- Your buddy in the (L)LM space. ☆64 · Updated last year
- a pipeline for using API calls to agnostically convert unstructured data into structured training data ☆32 · Updated last year
- GPT-4 Level Conversational QA Trained In a Few Hours ☆66 · Updated last year
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆109 · Updated 9 months ago
- Leverage your LangChain trace data for fine-tuning ☆46 · Updated last year
- Train your own SOTA deductive reasoning model ☆107 · Updated 9 months ago
- ☆74 · Updated 2 years ago
- IBM development fork of https://github.com/huggingface/text-generation-inference ☆62 · Updated 3 months ago