smpanaro / apple-silicon-4bit-quant
Supporting code for "LLMs for your iPhone: Whole-Tensor 4 Bit Quantization"
☆11 · Mar 31, 2024 · Updated last year
Alternatives and similar repositories for apple-silicon-4bit-quant
Users who are interested in apple-silicon-4bit-quant are comparing it to the libraries listed below.
- ModernBERT model optimized for Apple Neural Engine. ☆30 · Jan 10, 2025 · Updated last year
- Tool for visual profiling Core ML models, compatible with both package and compiled versions, including reasons for unsupported operation… ☆37 · Jun 18, 2024 · Updated last year
- Train small sequence models in your browser with WebGPU. ☆32 · Dec 3, 2025 · Updated 2 months ago
- Tool for exporting Apple Neural Engine-accelerated versions of transformers models on HuggingFace Hub. ☆13 · May 2, 2023 · Updated 2 years ago
- Code for my workshop "Production-ready WebAssembly with Rust" presented at RustLab 2023 in Florence ☆15 · Nov 23, 2023 · Updated 2 years ago
- [ICML 2024] Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity; Lu Yin*, Ajay Jaiswal*, Shiwei Liu, So… ☆16 · Apr 21, 2025 · Updated 9 months ago
- See the device (CPU/GPU/ANE) and estimated cost for every layer in your CoreML model. ☆25 · Oct 23, 2025 · Updated 3 months ago
- Find out why your CoreML model isn't running on the Neural Engine! ☆30 · Jun 18, 2024 · Updated last year
- Run transformers (incl. LLMs) on the Apple Neural Engine. ☆64 · Nov 22, 2023 · Updated 2 years ago
- Profile your CoreML models directly from Python 🐍 ☆30 · Sep 8, 2025 · Updated 5 months ago
- CLI to demonstrate running a large language model (LLM) on Apple Neural Engine. ☆121 · Dec 27, 2024 · Updated last year
- EfficientVLM: Fast and Accurate Vision-Language Models via Knowledge Distillation and Modal-adaptive Pruning (ACL 2023) ☆33 · Jul 18, 2023 · Updated 2 years ago
- A minimalistic Swift implementation of the Jinja templating engine, specifically designed for parsing and rendering ML chat templates. ☆112 · Jan 25, 2026 · Updated 3 weeks ago
- ☆11 · Jan 7, 2023 · Updated 3 years ago
- CodePath Slackbot (Fred) ☆11 · Mar 26, 2021 · Updated 4 years ago
- Example iOS app using the open-source combustion-ios-ble framework. ☆11 · Aug 2, 2023 · Updated 2 years ago
- MLX Implementation of Recursive Reasoning with Tiny Networks ☆78 · Oct 11, 2025 · Updated 4 months ago
- import documents for LLMs ☆46 · Jan 19, 2025 · Updated last year
- Showing full TensorBoard support in Tensorflow for a CNN using MNIST data. ☆13 · Oct 19, 2019 · Updated 6 years ago
- Fastai+PyTorch implementation of sparse model training methods (SET, SNFS, RigL) + customize-your-own. ☆10 · Oct 20, 2022 · Updated 3 years ago
- Alias multiple derives as one. ☆11 · Nov 30, 2024 · Updated last year
- ☆16 · Apr 30, 2025 · Updated 9 months ago
- a tiny, portable, stackless coroutine in C++11 ☆11 · May 17, 2023 · Updated 2 years ago
- An interactive, story-based Web Monetization tutorial for online creators. ☆11 · Mar 1, 2025 · Updated 11 months ago
- ☆11 · Apr 5, 2023 · Updated 2 years ago
- Try to export the ONNX QDQ model that conforms to the AXERA NPU quantization specification. Currently, only w8a8 is supported. ☆11 · Sep 10, 2024 · Updated last year
- Symbolic Graphics Programming with Large Language Models ☆37 · Sep 14, 2025 · Updated 5 months ago
- Nano vLLM ☆12 · Jun 26, 2025 · Updated 7 months ago
- Tracks the Rerun open source work ☆11 · Oct 3, 2022 · Updated 3 years ago
- ChineseCLIP using online learning ☆13 · Nov 7, 2022 · Updated 3 years ago
- codes for ICML2021 paper iDARTS: Differentiable Architecture Search with Stochastic Implicit Gradients ☆10 · May 27, 2021 · Updated 4 years ago
- Implementation of a Hierarchical Mamba as described in the paper: "Hierarchical State Space Models for Continuous Sequence-to-Sequence Mo… ☆15 · Nov 11, 2024 · Updated last year
- This Elgg plugin lets users preview MS Office files (doc, docx, xls, xlsx, ppt, pptx), Apple iWork pages, Adobe eps, and zip files using … ☆12 · Aug 28, 2015 · Updated 10 years ago
- An example of distributed tracing an MCP enabled agent ☆15 · Feb 4, 2026 · Updated last week
- Optimize your WebAssembly files ☆13 · Jul 4, 2025 · Updated 7 months ago
- Analyzes whole genome sequencing data for gene-editing verification ☆10 · Feb 6, 2026 · Updated last week
- ☆10 · Nov 16, 2024 · Updated last year
- An MCP server for Google Scholar written in TypeScript with Streamable HTTP ☆16 · Aug 18, 2025 · Updated 5 months ago
- Creole Network Monorepo ☆11 · Dec 2, 2024 · Updated last year