NEO-MLSys25 / NEO

NEO is an LLM inference engine built to relieve the GPU memory crisis through CPU offloading.
44 · Updated last month

Alternatives and similar repositories for NEO

Users who are interested in NEO are comparing it to the libraries listed below.
