FMInference / FlexLLMGen

Running large language models on a single GPU for throughput-oriented scenarios.
9,254 · Updated 2 months ago

Alternatives and similar repositories for FlexLLMGen:

Users who are interested in FlexLLMGen are comparing it to the libraries listed below.