smpanaro / apple-silicon-4bit-quant
Supporting code for "LLMs for your iPhone: Whole-Tensor 4 Bit Quantization"
⭐ 11 · Updated last year
Alternatives and similar repositories for apple-silicon-4bit-quant
Users interested in apple-silicon-4bit-quant are comparing it to the libraries listed below.
- ModernBERT model optimized for the Apple Neural Engine. ⭐ 29 · Updated 11 months ago
- Profile your CoreML models directly from Python. ⭐ 29 · Updated 3 months ago
- Find out why your CoreML model isn't running on the Neural Engine! ⭐ 28 · Updated last year
- SmolVLM2 Demo ⭐ 180 · Updated 9 months ago
- 1.58-bit LLM on Apple Silicon using MLX ⭐ 233 · Updated last year
- QuIP quantization ⭐ 61 · Updated last year
- Implementation of Nougat that focuses on processing PDFs locally. ⭐ 83 · Updated 11 months ago
- C API for MLX ⭐ 157 · Updated last week
- Distributed inference for MLX LLMs ⭐ 99 · Updated last year
- KAN (Kolmogorov–Arnold Networks) in the MLX framework for Apple Silicon ⭐ 31 · Updated 6 months ago
- ⭐ 66 · Updated 6 months ago
- Lightweight toolkit for training and fine-tuning 1.58-bit language models ⭐ 104 · Updated 7 months ago
- ⭐ 219 · Updated 11 months ago
- Advanced ultra-low-bitrate compression techniques for the LLaMA family of LLMs ⭐ 110 · Updated last year
- A simple MLX implementation for pretraining LLMs on Apple Silicon. ⭐ 84 · Updated 4 months ago
- MLX image models for Apple Silicon machines ⭐ 88 · Updated last month
- 1.58-bit LLaMA model ⭐ 83 · Updated last year
- RWKV-7: Surpassing GPT ⭐ 102 · Updated last year
- Inference of Mamba models in pure C ⭐ 196 · Updated last year
- An efficient implementation of the method proposed in "The Era of 1-bit LLMs" ⭐ 155 · Updated last year
- ⭐ 52 · Updated last year
- ⭐ 51 · Updated last year
- MLX implementations of various transformers, speedups, training ⭐ 33 · Updated 2 years ago
- A collection of optimizers for MLX ⭐ 54 · Updated 2 weeks ago
- MLX Transformers is a library that provides model implementations in MLX. It uses a similar model interface as HuggingFace Transformers an… ⭐ 69 · Updated last year
- ⭐ 68 · Updated last year
- Repo hosting code and materials related to speeding up LLM inference using token merging. ⭐ 37 · Updated 2 months ago
- Samples of good AI-generated CUDA kernels ⭐ 95 · Updated 7 months ago
- MLX port of xjdr's entropix sampler (mimics the JAX implementation) ⭐ 62 · Updated last year
- FlashAttention (Metal port) ⭐ 569 · Updated last year