NEO-MLSys25 / NEO

NEO is an LLM inference engine built to ease the GPU memory crisis through CPU offloading
19 · Updated last month

Alternatives and similar repositories for NEO:

Users who are interested in NEO are comparing it to the libraries listed below.