FMInference / FlexLLMGen
Running large language models on a single GPU for throughput-oriented scenarios.
9,383 · Oct 28, 2024 · Updated last year

Alternatives and similar repositories for FlexLLMGen

Users who are interested in FlexLLMGen are comparing it to the libraries listed below.
