pytorch-labs / executorch-examples

Example apps and demos using PyTorch's ExecuTorch framework

☆11

Alternatives and similar repositories for executorch-examples

Users who are interested in executorch-examples compare it to the libraries listed below.

huggingface / optimum-executorch
🤗 Optimum ExecuTorch
☆53 Updated last week
argmaxinc / WhisperKitAndroid
On-device Speech Recognition for Android
☆104 Updated this week
philipturner / metal-flash-attention
FlashAttention (Metal Port)
☆497 Updated 9 months ago
google-ai-edge / LiteRT
LiteRT continues the legacy of TensorFlow Lite as the trusted, high-performance runtime for on-device AI. Now with LiteRT Next, we're exp…
☆595 Updated this week
exo-explore / mlx-bitnet
1.58 Bit LLM on Apple Silicon using MLX
☆214 Updated last year
ikawrakow / ik_llama.cpp
llama.cpp fork with additional SOTA quants and improved performance
☆608 Updated this week
onnx / turnkeyml
No-code CLI designed for accelerating ONNX workflows
☆198 Updated 2 weeks ago
stevelaskaridis / awesome-mobile-llm
Awesome Mobile LLMs
☆204 Updated 3 weeks ago
zhouwg / ggml-hexagon
Reference implementation of the backend for llama.cpp on Android phones equipped with Qualcomm's Hexagon NPU; details can be seen at http…
☆23 Updated this week
google / minja
A minimalistic C++ Jinja templating engine for LLM chat templates
☆156 Updated last month
HanGuo97 / flute
Fast Matrix Multiplications for Lookup Table-Quantized LLMs
☆371 Updated 2 months ago
mlc-ai / tokenizers-cpp
Universal cross-platform tokenizers binding to HF and sentencepiece
☆350 Updated this week
google-ai-edge / ai-edge-torch
Supporting PyTorch models with the Google AI Edge TFLite runtime.
☆678 Updated this week
Dao-AILab / fast-hadamard-transform
Fast Hadamard transform in CUDA, with a PyTorch interface
☆201 Updated last year
DakeQQ / YOLO-Depth-Estimation-for-Android
Demonstration of combining YOLO and depth estimation on an Android device.
☆51 Updated last month
Cornell-RelaxML / qtip
☆137 Updated this week
apple / ml-recurrent-drafter
☆213 Updated 5 months ago
lemonade-sdk / lemonade
Local LLM Server with GPU and NPU Acceleration
☆138 Updated last week
ROCm / aiter
AI Tensor Engine for ROCm
☆208 Updated this week
ngxson / ggml-easy
Thin wrapper around GGML to make life easier
☆35 Updated 3 weeks ago
google-ai-edge / ai-edge-quantizer
AI Edge Quantizer: flexible post training quantization for LiteRT models.
☆49 Updated this week
balisujohn / tortoise.cpp
A ggml (C++) re-implementation of tortoise-tts
☆187 Updated 10 months ago
sanctuary-systems-com / llama_multiserver
A proxy that hosts multiple single-model runners such as llama.cpp and vLLM
☆11 Updated 3 weeks ago
smpanaro / more-ane-transformers
Run transformers (incl. LLMs) on the Apple Neural Engine.
☆61 Updated last year
ARM-software / kleidiai
This repository is a read-only mirror of https://gitlab.arm.com/kleidi/kleidiai
☆51 Updated last week
Repeerc / flash-attention-v2-RDNA3-minimal
A simple Flash Attention v2 implementation with ROCm (RDNA3 GPU, rocWMMA), mainly used for Stable Diffusion (ComfyUI) in Windows ZLUDA en…
☆43 Updated 10 months ago
microsoft / onnxscript
ONNX Script enables developers to naturally author ONNX functions and models using a subset of Python.
☆360 Updated this week
Infini-AI-Lab / UMbreLLa
LLM Inference on consumer devices
☆119 Updated 3 months ago
flawedmatrix / mamba-ssm
Implementation of Mamba in Rust
☆87 Updated last year
neuralmagic / compressed-tensors
A safetensors extension to efficiently store sparse quantized tensors on disk
☆129 Updated this week