huggingface / optimum-executorch
🤗 Optimum ExecuTorch
☆67 Updated last week
Alternatives and similar repositories for optimum-executorch
Users who are interested in optimum-executorch compare it to the libraries listed below.
- A safetensors extension to efficiently store sparse quantized tensors on disk ☆167 Updated this week
- Load compute kernels from the Hub ☆293 Updated last week
- Google TPU optimizations for transformers models ☆120 Updated 8 months ago
- Fast low-bit matmul kernels in Triton ☆376 Updated last week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆265 Updated 11 months ago
- Inference Vision Transformer (ViT) in plain C/C++ with ggml ☆296 Updated last year
- Explore training for quantized models ☆24 Updated 2 months ago
- Use safetensors with ONNX 🤗 ☆69 Updated last week
- ☆218 Updated 8 months ago
- Python bindings for ggml ☆146 Updated last year
- AI Edge Quantizer: flexible post training quantization for LiteRT models. ☆69 Updated this week
- Fault tolerance for PyTorch (HSDP, LocalSGD, DiLoCo, Streaming DiLoCo) ☆414 Updated last week
- A tool to configure, launch and manage your machine learning experiments. ☆195 Updated last week
- A minimalistic C++ Jinja templating engine for LLM chat templates ☆187 Updated 2 weeks ago
- Advanced Quantization Algorithm for LLMs and VLMs, with support for CPU, Intel GPU, CUDA and HPU. ☆651 Updated this week
- Official implementation for Training LLMs with MXFP4 ☆93 Updated 5 months ago
- ☆333 Updated 3 weeks ago
- This repository contains the experimental PyTorch native float8 training UX ☆224 Updated last year
- Fast Matrix Multiplications for Lookup Table-Quantized LLMs ☆374 Updated 5 months ago
- Write a fast kernel and run it on Discord. See how you compare against the best! ☆58 Updated last week
- An efficient implementation of the method proposed in "The Era of 1-bit LLMs" ☆154 Updated 11 months ago
- An innovative library for efficient LLM inference via low-bit quantization ☆350 Updated last year
- ☆89 Updated last year
- Open-source reproducible benchmarks from Argmax ☆60 Updated last week
- Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… ☆269 Updated 2 months ago
- 👷 Build compute kernels ☆155 Updated this week
- Applied AI experiments and examples for PyTorch ☆296 Updated last month
- Official implementation of Half-Quadratic Quantization (HQQ) ☆879 Updated last month
- Scalable and Performant Data Loading ☆304 Updated this week
- ☆21 Updated 7 months ago