Mininglamp-AI / ciderView on GitHub
W8A8/W4A8 inference + optimized SDPA on Apple Silicon — unlocking unused INT8 TensorOps in M5 for 1.2–1.9× faster LLM prefill, plus FlashInfer-inspired GQA decode attention for up to 1.6× SDPA speedup, built as MLX custom primitives.
428Jun 5, 2026Updated last week

Alternatives and similar repositories for cider

Users that are interested in cider are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

Are these results useful?