intel / neural-speed

An innovative library for efficient LLM inference via low-bit quantization
348 · Updated 2 months ago

Related projects

Alternatives and complementary repositories for neural-speed