fairydreaming / distributed-llama
Tensor parallelism is all you need. Run LLMs on an AI cluster at home using any device. Distribute the workload, divide RAM usage, and increase inference speed.
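The core idea behind the description above — splitting a model's tensors across devices so each node holds only a shard of the weights and computes its slice of the output — can be sketched roughly as follows. This is an illustrative toy, not distributed-llama's actual implementation; the function names and the row-wise split are assumptions for the sketch.

```python
# Toy sketch of tensor parallelism (not distributed-llama's code):
# a layer's weight matrix is split row-wise across "devices"; each device
# stores only its shard (dividing RAM usage) and computes its slice of the
# matrix-vector product; the slices are concatenated into the full output.

def matvec(rows, x):
    # Plain matrix-vector product over a list-of-rows matrix.
    return [sum(w * xi for w, xi in zip(row, x)) for row in rows]

def split_rows(W, n_devices):
    # Partition the rows of W into n_devices contiguous shards.
    k, r = divmod(len(W), n_devices)
    shards, i = [], 0
    for d in range(n_devices):
        size = k + (1 if d < r else 0)
        shards.append(W[i:i + size])
        i += size
    return shards

W = [[1, 2], [3, 4], [5, 6], [7, 8]]   # 4x2 weight matrix
x = [1, 1]

shards = split_rows(W, 2)              # two hypothetical devices
partials = [matvec(s, x) for s in shards]
y = [v for p in partials for v in p]   # concatenate per-device outputs

assert y == matvec(W, x)               # matches the single-device result
```

Each partial product is independent, so the per-device computations can run concurrently; in a real cluster the concatenation step would be a network gather rather than a list comprehension.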
18 · Nov 11, 2024 · Updated last year

Alternatives and similar repositories for distributed-llama

Users interested in distributed-llama are comparing it to the libraries listed below.
