intel / ipex-llm-tutorial
Accelerate LLM with low-bit (FP4 / INT4 / FP8 / INT8) optimizations using ipex-llm
☆166Updated 3 months ago
Alternatives and similar repositories for ipex-llm-tutorial
Users who are interested in ipex-llm-tutorial are comparing it to the libraries listed below.
- ☆427Updated 3 weeks ago
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including …☆264Updated last week
- LLM Inference benchmark☆426Updated last year
- a lightweight LLM model inference framework☆736Updated last year
- ☆172Updated this week
- Low-bit LLM inference on CPU/NPU with lookup table☆838Updated 2 months ago
- Pretrain, finetune and serve LLMs on Intel platforms with Ray☆128Updated last week
- ☆335Updated last week
- RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.☆837Updated 2 weeks ago
- Community maintained hardware plugin for vLLM on Ascend☆977Updated last week
- ☆50Updated 9 months ago
- ☆479Updated last week
- LLM101n: Let's build a Storyteller (Chinese edition)☆132Updated last year
- Ascend PyTorch adapter (torch_npu). Mirror of https://gitee.com/ascend/pytorch☆406Updated this week
- FlagScale is a large model toolkit based on open-sourced projects.☆338Updated this week
- ☆318Updated last month
- C++ implementation of Qwen-LM☆607Updated 8 months ago
- Accelerate inference without tears☆322Updated 5 months ago
- Efficient AI Inference & Serving☆472Updated last year
- llama2.c-zh: a small language model that supports Chinese-language scenarios☆149Updated last year
- LLM/MLOps/LLMOps☆105Updated 2 months ago
- ☆47Updated last year
- Triton documentation in Simplified Chinese☆78Updated 4 months ago
- A high-performance inference system for large language models, designed for production environments.☆459Updated 3 weeks ago
- Run generative AI models in sophgo BM1684X/BM1688☆233Updated this week
- ☆54Updated this week
- Performance testing for LLM inference services☆44Updated last year
- Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).☆246Updated last year
- ☆69Updated 9 months ago
- A high-performance deep learning training platform with task-level time-shared scheduling of GPU compute☆690Updated last year