NEO-MLSys25 / NEO

NEO is an LLM inference engine built to ease the GPU memory crisis through CPU offloading.
39 · Updated last week

Alternatives and similar repositories for NEO

Users who are interested in NEO are comparing it to the libraries listed below.
