di37 / LLM-Load-Unload-Ollama

This is a simple demonstration of how, when using Ollama, to keep an LLM loaded in memory for a prolonged time, or to unload the model immediately after inference.
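A minimal sketch of how this is typically done through Ollama's HTTP API: the `keep_alive` field on a generate request controls how long the model stays resident after responding (`"5m"` by default, `0` to unload immediately, `-1` to keep it loaded indefinitely). The endpoint URL assumes a default local Ollama install; the helper names here are illustrative, not from this repository.

```python
import json
import urllib.request

# Default endpoint for a local Ollama server (assumption: standard install).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(model: str, prompt: str, keep_alive) -> dict:
    """Build a /api/generate request body with an explicit keep_alive.

    keep_alive follows Ollama's documented semantics:
      "5m" -> keep the model loaded for 5 minutes after the request (default)
      0    -> unload the model immediately after responding
      -1   -> keep the model loaded indefinitely
    """
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "keep_alive": keep_alive,
    }


def generate(model: str, prompt: str, keep_alive="5m") -> str:
    """Send a generate request and return the response text.

    Requires a running Ollama server; this is a sketch, not a hardened client.
    """
    data = json.dumps(build_payload(model, prompt, keep_alive)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Usage: `generate("llama3", "Hello", keep_alive=-1)` keeps the model pinned in memory across requests, while `generate("llama3", "Hello", keep_alive=0)` frees the memory as soon as the answer comes back.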

Related projects: