NEO-MLSys25 / NEO

NEO is an LLM inference engine built to ease the GPU memory crisis through CPU offloading
36 · Updated 3 months ago

Alternatives and similar repositories for NEO

Users who are interested in NEO are comparing it to the libraries listed below.
