huggingface / optimum-amd
AMD-related optimizations for transformer models
☆96 · Updated last month
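As a quick orientation, here is a minimal sketch of the kind of workflow this library targets. It uses 🤗 Optimum's ONNX Runtime integration with the ROCm execution provider rather than optimum-amd's own APIs, so treat it as an illustration of the surrounding ecosystem: the checkpoint name is only an example, and an ROCm-enabled onnxruntime build is assumed.

```python
# Minimal sketch: ONNX Runtime inference on an AMD GPU via 🤗 Optimum.
# Assumes an ROCm-enabled onnxruntime build; the checkpoint is an example.
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer

model_id = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Export the PyTorch checkpoint to ONNX and bind it to the ROCm provider.
model = ORTModelForSequenceClassification.from_pretrained(
    model_id,
    export=True,
    provider="ROCMExecutionProvider",
)

inputs = tokenizer("ROCm keeps this on the GPU", return_tensors="pt")
print(model(**inputs).logits)
```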
Alternatives and similar repositories for optimum-amd
Users interested in optimum-amd are comparing it to the libraries listed below.
- An innovative library for efficient LLM inference via low-bit quantization ☆350 · Updated last year
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs (see the usage sketch after this list) ☆93 · Updated this week
- Fast and memory-efficient exact attention ☆202 · Updated this week
- No-code CLI designed for accelerating ONNX workflows ☆219 · Updated 5 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆267 · Updated last year
- ☆219 · Updated 10 months ago
- A safetensors extension to efficiently store sparse quantized tensors on disk ☆214 · Updated last week
- 🏋️ A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support of O… ☆320 · Updated 2 months ago
- llama.cpp to PyTorch converter ☆34 · Updated last year
- ☆120 · Updated last year
- Prepare for DeepSeek R1 inference: benchmark CPU, DRAM, SSD, iGPU, GPU, ... with efficient code ☆73 · Updated 10 months ago
- ☆170 · Updated 3 weeks ago
- ☆111 · Updated 3 weeks ago
- ☆158 · Updated 5 months ago
- Easy and lightning-fast training of 🤗 Transformers on Habana Gaudi processors (HPU) ☆201 · Updated last week
- Development repository for the Triton language and compiler ☆137 · Updated last week
- 👷 Build compute kernels ☆192 · Updated this week
- Advanced quantization toolkit for LLMs and VLMs. Native support for WOQ, MXFP4, NVFP4, GGUF, Adaptive Schemes and seamless integration wi… ☆753 · Updated this week
- Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU (XPU) devices. Note… ☆63 · Updated 5 months ago
- Onboarding documentation source for the AMD Ryzen™ AI Software Platform. The AMD Ryzen™ AI Software Platform enables developers to take… ☆87 · Updated last week
- 🤗 Optimum Intel: Accelerate inference with Intel optimization tools ☆515 · Updated this week
- Inference server benchmarking tool ☆130 · Updated 2 months ago
- ☆78 · Updated last year
- Use safetensors with ONNX 🤗 ☆76 · Updated 2 months ago
- [DEPRECATED] Moved to the ROCm/rocm-libraries repo ☆114 · Updated this week
- Easy and Efficient Quantization for Transformers ☆203 · Updated 5 months ago
- High-speed and easy-to-use LLM serving framework for local deployment ☆137 · Updated 4 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆111 · Updated this week
- Python package of rocm-smi-lib ☆24 · Updated last week
- A general 2–8 bit quantization toolbox with GPTQ/AWQ/HQQ/VPTQ, with easy export to onnx/onnx-runtime ☆183 · Updated 8 months ago
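Several entries above are vLLM or vLLM forks; as referenced in the vLLM item, here is a minimal offline-inference sketch of its generation API. The model name is just a small placeholder; any Hugging Face causal LM that fits in memory works.

```python
# Minimal sketch of vLLM's offline generation API.
# "facebook/opt-125m" is just a small placeholder checkpoint.
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")
params = SamplingParams(temperature=0.8, max_tokens=64)

outputs = llm.generate(["Hello, my name is"], params)
for out in outputs:
    print(out.outputs[0].text)
```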