di37 / LLM-Load-Unload-Ollama

This is a simple demonstration of how to keep an LLM loaded in memory for a prolonged time, or unload the model immediately after inference, when using it via Ollama.
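A minimal sketch of the idea using Ollama's REST API and its documented `keep_alive` parameter: a negative value keeps the model resident in memory indefinitely, `0` unloads it immediately after the response, and a duration string like `"5m"` (the Ollama default) keeps it for that long. The endpoint is Ollama's default local address; the model name is an assumption and should be replaced with one you have pulled.

```python
import requests

# Default local Ollama endpoint (assumption: Ollama is running locally).
OLLAMA_URL = "http://localhost:11434/api/generate"


def generate(prompt: str, keep_alive):
    """Send a generation request; keep_alive controls how long the model
    stays loaded afterwards (-1 = indefinitely, 0 = unload immediately,
    "5m" = five minutes, the Ollama default)."""
    response = requests.post(
        OLLAMA_URL,
        json={
            "model": "llama3",  # hypothetical model name; use one you have pulled
            "prompt": prompt,
            "stream": False,
            "keep_alive": keep_alive,
        },
        timeout=300,
    )
    response.raise_for_status()
    return response.json()["response"]


# Keep the model loaded in memory indefinitely after this call.
print(generate("Why is the sky blue?", keep_alive=-1))

# Unload the model from memory immediately after this call.
print(generate("Summarize Ollama in one line.", keep_alive=0))
```

Keeping the model resident trades memory for latency: follow-up requests skip the load step, while `keep_alive=0` frees GPU/RAM right away at the cost of reloading on the next call.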
