akx / ollama-dl
Download models from the Ollama library, without Ollama
☆96 · Updated 10 months ago
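For context, here is a minimal sketch of the idea behind a downloader like this, assuming the Ollama library registry at registry.ollama.ai speaks the standard OCI distribution protocol (manifest plus content-addressed blobs). The `pull` helper, URL layout, and model name below are illustrative assumptions, not ollama-dl's actual code:

```python
# Illustrative sketch only (not ollama-dl's actual implementation).
# Assumption: the Ollama library is served from registry.ollama.ai using
# the standard OCI distribution layout (manifest + content-addressed blobs).
import json
import urllib.request

REGISTRY = "https://registry.ollama.ai"  # assumed public registry host

def pull(model: str, tag: str = "latest") -> None:
    """Download a library model's manifest and all of its layer blobs."""
    # 1. Fetch the manifest, which lists the model's layers
    #    (GGUF weights, parameters, chat template, license, ...).
    req = urllib.request.Request(
        f"{REGISTRY}/v2/library/{model}/manifests/{tag}",
        headers={"Accept": "application/vnd.docker.distribution.manifest.v2+json"},
    )
    with urllib.request.urlopen(req) as resp:
        manifest = json.load(resp)

    # 2. Stream each layer to disk, named by its content digest.
    for layer in manifest.get("layers", []):
        digest = layer["digest"]  # e.g. "sha256:abc..."
        out_path = digest.replace(":", "-")
        blob_url = f"{REGISTRY}/v2/library/{model}/blobs/{digest}"
        with urllib.request.urlopen(blob_url) as resp, open(out_path, "wb") as f:
            while chunk := resp.read(1 << 20):  # 1 MiB chunks
                f.write(chunk)
        print(f"{layer.get('mediaType', 'blob')} -> {out_path}")

if __name__ == "__main__":
    pull("llama3.2")  # example model name
```

Under these assumptions, the downloaded GGUF layer can then be loaded directly by llama.cpp-based tools, which is the point of skipping Ollama itself.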
Alternatives and similar repositories for ollama-dl
Users interested in ollama-dl are comparing it to the libraries listed below:
- Tool to download models from Huggingface Hub and convert them to GGML/GGUF for llama.cpp ☆159 · Updated 4 months ago
- LLM Benchmark for Throughput via Ollama (Local LLMs) ☆291 · Updated last month
- Review/Check GGUF files and estimate the memory usage and maximum tokens per second. ☆204 · Updated 3 weeks ago
- A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights. ☆64 · Updated last year
- Dagger functions to import Hugging Face GGUF models into a local ollama instance and optionally push them to ollama.com. ☆116 · Updated last year
- An OpenAI-compatible API for chat with image input and questions about the images, aka multimodal. ☆259 · Updated 6 months ago
- VSCode AI coding assistant powered by a self-hosted llama.cpp endpoint. ☆183 · Updated 7 months ago
- Code execution utilities for Open WebUI & Ollama ☆296 · Updated 10 months ago
- Automatically quantize GGUF models ☆199 · Updated this week
- LM inference server implementation based on *.cpp. ☆274 · Updated 3 weeks ago
- LLaMA Server combines the power of LLaMA C++ with the beauty of Chatbot UI. ☆129 · Updated 2 years ago
- ☆98 · Updated 3 weeks ago
- Link your Ollama models to LM-Studio ☆141 · Updated last year
- ☆209 · Updated this week
- LLM inference in C/C++ ☆102 · Updated 2 weeks ago
- A simple Jupyter Notebook for learning MLX text-completion fine-tuning! ☆121 · Updated 10 months ago
- Lightweight inference server for OpenVINO ☆210 · Updated this week
- Web UI for ExLlamaV2 ☆513 · Updated 7 months ago
- Smart proxy for LLM APIs that enables model-specific parameter control, automatic mode switching (like Qwen3's /think and /no_think), and… ☆50 · Updated 3 months ago
- A platform to self-host AI on easy mode ☆161 · Updated last week
- klmbr - a prompt pre-processing technique to break through the barrier of entropy while generating text with LLMs ☆80 · Updated 11 months ago
- A Python package for serving LLMs on OpenAI-compatible API endpoints with prompt caching, using MLX. ☆95 · Updated 2 months ago
- FastMLX is a high-performance, production-ready API to host MLX models. ☆326 · Updated 5 months ago
- ☆264 · Updated 3 months ago
- Run multiple resource-heavy Large Models (LM) on the same machine with a limited amount of VRAM/other resources by exposing them on differe… ☆82 · Updated last week
- Something similar to Apple Intelligence? ☆61 · Updated last year
- Gemma 2 optimized for your local machine. ☆375 · Updated last year
- A fast batching API to serve LLM models ☆187 · Updated last year
- Phi-3.5 for Mac: locally run Vision and Language Models for Apple Silicon ☆273 · Updated last year
- This small API downloads and exposes access to NeuML's txtai-wikipedia and full Wikipedia datasets, taking in a query and returning full … ☆100 · Updated 2 weeks ago