huggingface / optimum-executorch
🤗 Optimum ExecuTorch
⭐ 54 · Updated last week
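For context, optimum-executorch bridges 🤗 Transformers checkpoints to the ExecuTorch on-device runtime. The snippet below is a minimal sketch of that workflow, not a definitive reference: the `ExecuTorchModelForCausalLM` import path, the `recipe="xnnpack"` argument, and the `text_generation` helper follow the project's README-style API and are assumptions that may differ in the release you install.

```python
# Minimal sketch (assumed API): export a causal-LM checkpoint to ExecuTorch
# and run generation with the resulting on-device program.
# Requires: pip install optimum-executorch transformers
from transformers import AutoTokenizer
from optimum.executorch import ExecuTorchModelForCausalLM  # assumed import path

model_id = "HuggingFaceTB/SmolLM2-135M-Instruct"  # illustrative checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Export through the XNNPACK backend recipe (CPU-oriented, mobile-friendly);
# the "recipe" keyword is an assumption based on the project's documentation.
model = ExecuTorchModelForCausalLM.from_pretrained(model_id, recipe="xnnpack")

# Run generation with the exported program; signature is assumed.
output = model.text_generation(
    tokenizer=tokenizer,
    prompt="Explain what ExecuTorch is in one sentence.",
    max_seq_len=128,
)
print(output)
```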
Alternatives and similar repositories for optimum-executorch
Users interested in optimum-executorch are comparing it to the libraries listed below.
- ⭐ 214 · Updated 5 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs · ⭐ 264 · Updated 9 months ago
- A safetensors extension to efficiently store sparse quantized tensors on disk · ⭐ 135 · Updated this week
- Load compute kernels from the Hub · ⭐ 207 · Updated this week
- A tool to configure, launch and manage your machine learning experiments · ⭐ 171 · Updated this week
- AI Edge Quantizer: flexible post-training quantization for LiteRT models · ⭐ 53 · Updated last week
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand · ⭐ 188 · Updated last month
- Google TPU optimizations for transformers models · ⭐ 116 · Updated 5 months ago
- Python bindings for ggml · ⭐ 142 · Updated 10 months ago
- Supporting PyTorch models with the Google AI Edge TFLite runtime · ⭐ 706 · Updated this week
- Fast low-bit matmul kernels in Triton · ⭐ 330 · Updated last week
- LiteRT continues the legacy of TensorFlow Lite as the trusted, high-performance runtime for on-device AI. Now with LiteRT Next, we're exp… · ⭐ 651 · Updated this week
- Scalable and Performant Data Loading · ⭐ 288 · Updated last week
- PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference · ⭐ 64 · Updated 3 months ago
- Fault tolerance for PyTorch (HSDP, LocalSGD, DiLoCo, Streaming DiLoCo) · ⭐ 361 · Updated this week
- Advanced Quantization Algorithm for LLMs and VLMs, with support for CPU, Intel GPU, CUDA and HPU. Seamlessly integrated with Torchao, Tra… · ⭐ 528 · Updated this week
- Model compression for ONNX · ⭐ 96 · Updated 7 months ago
- Fast Matrix Multiplications for Lookup Table-Quantized LLMs · ⭐ 373 · Updated 3 months ago
- Use safetensors with ONNX 🤗 · ⭐ 67 · Updated 2 weeks ago
- Where GPUs get cooked 👩‍🍳🔥 · ⭐ 236 · Updated 4 months ago
- A minimalistic C++ Jinja templating engine for LLM chat templates · ⭐ 160 · Updated this week
- Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU) · ⭐ 190 · Updated this week
- 🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components · ⭐ 206 · Updated this week
- ⭐ 225 · Updated this week
- 🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… · ⭐ 255 · Updated this week
- A curated collection of resources, tutorials, and best practices for learning and mastering NVIDIA CUTLASS · ⭐ 196 · Updated 2 months ago
- High-Performance SGEMM on CUDA devices · ⭐ 97 · Updated 5 months ago
- Official Problem Sets / Reference Kernels for the GPU MODE Leaderboard! · ⭐ 62 · Updated 2 weeks ago
- PyTorch Single Controller · ⭐ 318 · Updated this week
- Write a fast kernel and run it on Discord. See how you compare against the best! · ⭐ 46 · Updated 2 weeks ago