High-speed and easy-to-use LLM serving framework for local deployment
☆149Aug 7, 2025Updated 9 months ago
Alternatives and similar repositories for PowerServe
Users that are interested in PowerServe are comparing it to the libraries listed below.
- Bamboo-7B Large Language Model☆93Mar 28, 2024Updated 2 years ago
- Self-implemented NN operators for Qualcomm's Hexagon NPU☆64Sep 30, 2025Updated 7 months ago
- ☆78Dec 16, 2025Updated 4 months ago
- ☆75Oct 6, 2023Updated 2 years ago
- A project for QNN quantization and CPU inference of YOLOv5 in the Qualcomm AI Engine Direct environment☆17Sep 10, 2024Updated last year
- Inference RWKV v5, v6 and v7 with Qualcomm AI Engine Direct SDK☆91Updated this week
- Run Chinese MobileBert model on SNPE.☆15May 19, 2023Updated 2 years ago
- [EMNLP Findings 2024] MobileQuant: Mobile-friendly Quantization for On-device Language Models☆68Sep 22, 2024Updated last year
- The open-source project for "Mandheling: Mixed-Precision On-Device DNN Training with DSP Offloading"[MobiCom'2022]☆19Aug 4, 2022Updated 3 years ago
- LLM inference in C/C++☆51Updated this week
- Mic-controlled mouse clicks☆17Oct 6, 2025Updated 7 months ago
- Fast Multimodal LLM on Mobile Devices☆1,484Apr 30, 2026Updated last week
- C++ implementations for various tokenizers (sentencepiece, tiktoken etc).☆49Apr 21, 2026Updated 2 weeks ago
- ☆23Updated this week
- A fully autonomous agent that accesses the browser and performs tasks.☆18Apr 25, 2025Updated last year
- High-speed Large Language Model Serving for Local Deployment☆9,423Jan 24, 2026Updated 3 months ago
- Milk-V Duo. Internet access through a USB RNDIS connection to the host machine☆16Jan 11, 2024Updated 2 years ago
- This repository is a read-only mirror of https://gitlab.arm.com/kleidi/kleidiai☆139Apr 27, 2026Updated last week
- Example apps for LeapSDK☆62Apr 30, 2026Updated last week
- Study materials collected while studying☆51Apr 16, 2022Updated 4 years ago
- Running Microsoft's BitNet inference framework via FastAPI, Uvicorn and Docker.☆38Jul 2, 2025Updated 10 months ago
- Create text chunks which end at natural stopping points without using a tokenizer☆26Nov 26, 2025Updated 5 months ago
- This project is intended to build and deploy a scene detection application onto the Qualcomm Robotics Development Kit (RB5) that detects whether …☆10Jun 26, 2022Updated 3 years ago
- A sleek, customizable interface for managing LLMs with responsive design and easy agent personalization.☆17Aug 30, 2024Updated last year
- A Triton JIT runtime and ffi provider in C++☆33Apr 28, 2026Updated last week
- ☆21Oct 2, 2024Updated last year
- ☆43Mar 29, 2025Updated last year
- A powerful and user-friendly tool that generates detailed captions for your images☆21Nov 11, 2024Updated last year
- Experimental interface environment for open source LLM, designed to democratize the use of AI. Powered by llama-cpp, llama-cpp-python and…☆18Oct 11, 2025Updated 6 months ago
- ☆13Jan 7, 2025Updated last year
- 2022 Chcore Lab☆49Jun 7, 2022Updated 3 years ago
- matmul using AMX instructions☆24May 7, 2024Updated 2 years ago
- the original reference implementation of a specified llama.cpp backend for Qualcomm Hexagon NPU on Android phone, https://github.com/ggml…☆41Jul 14, 2025Updated 9 months ago
- Low-bit LLM inference on CPU/NPU with lookup table☆954Jun 5, 2025Updated 11 months ago
- ☆18Jan 27, 2025Updated last year
- MobiSys#114☆23Aug 17, 2023Updated 2 years ago
- a single-header math library☆17Nov 7, 2025Updated 5 months ago
- A thin cython wrapper around llama.cpp, whisper.cpp and stable-diffusion.cpp☆25Updated this week
- ☆188Apr 24, 2026Updated last week