akx / ollama-dl
Download models from the Ollama library, without Ollama
☆86 · Updated 7 months ago
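For context on what "without Ollama" means in practice, the sketch below shows the general idea of pulling a model's layers directly over HTTP, assuming the registry at registry.ollama.ai exposes an OCI-style `/v2/` distribution API. The host, paths, media type, and the `download_model` helper are assumptions for illustration, not taken from the ollama-dl source.

```python
# Minimal sketch: fetch a model manifest and download its layer blobs over
# plain HTTP. The registry host, URL layout, and Accept header are assumptions
# (OCI-style distribution API), not copied from ollama-dl itself.
import json
import urllib.request

REGISTRY = "https://registry.ollama.ai"  # assumed registry host


def download_model(name: str, tag: str = "latest", dest: str = ".") -> None:
    # 1. Fetch the manifest, which lists the content-addressed blobs
    #    (weights, template, parameters) that make up the model.
    manifest_url = f"{REGISTRY}/v2/library/{name}/manifests/{tag}"
    req = urllib.request.Request(
        manifest_url,
        headers={"Accept": "application/vnd.docker.distribution.manifest.v2+json"},
    )
    with urllib.request.urlopen(req) as resp:
        manifest = json.load(resp)

    # 2. Download each layer blob by its digest.
    for layer in manifest.get("layers", []):
        digest = layer["digest"]  # e.g. "sha256:abc123..."
        blob_url = f"{REGISTRY}/v2/library/{name}/blobs/{digest}"
        out_path = f"{dest}/{digest.replace(':', '-')}"
        print(f"Downloading {digest} ({layer.get('size', '?')} bytes)")
        urllib.request.urlretrieve(blob_url, out_path)


if __name__ == "__main__":
    download_model("llama3", "latest")
```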
Alternatives and similar repositories for ollama-dl
Users interested in ollama-dl are comparing it to the libraries listed below.
- Tool to download models from the Hugging Face Hub and convert them to GGML/GGUF for llama.cpp ☆152 · Updated last month
- Smart proxy for LLM APIs that enables model-specific parameter control, automatic mode switching (like Qwen3's /think and /no_think), and… ☆48 · Updated last month
- Extract structured data from local or remote LLM models ☆42 · Updated last year
- LLM inference in C/C++ ☆77 · Updated this week
- ☆157 · Updated 11 months ago
- Run multiple resource-heavy Large Models (LM) on the same machine with a limited amount of VRAM/other resources by exposing them on differe… ☆67 · Updated last week
- ☆95 · Updated 6 months ago
- A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights. ☆64 · Updated last year
- ☆19 · Updated 4 months ago
- A simple speech-to-text and text-to-speech AI chatbot that can be run fully offline. ☆45 · Updated last year
- LLM Benchmark for Throughput via Ollama (Local LLMs) ☆244 · Updated 3 weeks ago
- Falcon LLM ggml framework with CPU and GPU support ☆246 · Updated last year
- klmbr - a prompt pre-processing technique to break through the barrier of entropy while generating text with LLMs ☆76 · Updated 9 months ago
- Cortex.Tensorrt-LLM is a C++ inference library that can be loaded by any server at runtime. It submodules NVIDIA’s TensorRT-LLM for GPU a… ☆43 · Updated 9 months ago
- Automatically quantize GGUF models ☆184 · Updated last week
- ☆204 · Updated last month
- Local LLM Server with GPU and NPU Acceleration ☆138 · Updated this week
- RetroChat is a powerful command-line interface for interacting with various AI language models. It provides a seamless experience for eng… ☆76 · Updated 3 weeks ago
- Testing LLM reasoning abilities with lineage relationship quizzes. ☆28 · Updated 3 months ago
- A stable, fast and easy-to-use inference library with a focus on a sync-to-async API ☆45 · Updated 9 months ago
- ☆24 · Updated 5 months ago
- Something similar to Apple Intelligence? ☆61 · Updated 11 months ago
- Serving LLMs in the HF-Transformers format via a PyFlask API ☆71 · Updated 9 months ago
- Pybind11 bindings for Whisper.cpp ☆58 · Updated 3 weeks ago
- Virtual environment stacks for Python ☆258 · Updated this week
- A Python package for serving LLMs on OpenAI-compatible API endpoints with prompt caching, using MLX. ☆86 · Updated this week
- A simple Jupyter Notebook for learning MLX text-completion fine-tuning! ☆119 · Updated 7 months ago
- ☆114 · Updated 6 months ago
- A guidance compatibility layer for llama-cpp-python ☆35 · Updated last year
- Lightweight inference server for OpenVINO ☆187 · Updated last week