intel-analytics / ipex-llm-tutorial

Accelerate LLM with low-bit (FP4 / INT4 / FP8 / INT8) optimizations using ipex-llm

☆152

Alternatives and similar repositories for ipex-llm-tutorial:

Users who are interested in ipex-llm-tutorial are comparing it to the repositories listed below

intel / xFasterTransformer
☆392 Updated this week
modelscope / dash-infer
DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including …
☆208 Updated this week
ninehills / llm-inference-benchmark
LLM Inference benchmark
☆377 Updated 5 months ago
owenliang / qwen-vllm
Tongyi Qianwen (Qwen) vLLM inference deployment demo
☆496 Updated 9 months ago
sophgo / ChatGLM2-TPU
run ChatGLM2-6B in BM1684X
☆49 Updated 10 months ago
wangzhaode / llm-export
llm-export can export llm model to onnx.
☆255 Updated last week
microsoft / T-MAC
Low-bit LLM inference on CPU with lookup table
☆646 Updated last week
hyperai / vllm-cn
vLLM documentation in Simplified Chinese
☆22 Updated last week
wangshuai09 / vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
☆29 Updated this week
FlagOpen / FlagPerf
FlagPerf is an open-source software platform for benchmarking AI chips.
☆317 Updated 2 weeks ago
datawhalechina / llm-deploy
Theory and practice of large model (LLM) inference and deployment
☆140 Updated 2 weeks ago
SmartFlowAI / LLM101n-CN
LLM101n: Let's build a Storyteller (Chinese edition)
☆121 Updated 5 months ago
QwenLM / qwen.cpp
C++ implementation of Qwen-LM
☆569 Updated last month
hahnyuan / LLM-Viewer
Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline mod…
☆373 Updated 4 months ago
OpenCSGs / llm-inference
llm-inference is a platform for publishing and managing llm inference, providing a wide range of out-of-the-box features for model deploy…
☆74 Updated 8 months ago
pcg-mlp / KsanaLLM
☆302 Updated 3 weeks ago
chenyangMl / llama2.c-zh
llama2.c-zh: a small language model that supports Chinese-language scenarios
☆145 Updated 10 months ago
OpenBMB / MiniCPM-CookBook
This is a user guide for the MiniCPM and MiniCPM-V series of small language models (SLMs) developed by ModelBest. “面壁小钢炮” focuses on achi…
☆171 Updated 2 months ago
alibaba / rtp-llm
RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.
☆582 Updated 3 months ago
Ascend / pytorch
Ascend PyTorch adapter (torch_npu). Mirror of https://gitee.com/ascend/pytorch
☆290 Updated this week
AI-Study-Han / Zero-Chatgpt
Walk through the ChatGPT technical pipeline from scratch.
☆184 Updated 4 months ago
FlagOpen / FlagGems
FlagGems is an operator library for large language models implemented in Triton Language.
☆397 Updated this week
MegEngine / InferLLM
a lightweight LLM model inference framework
☆712 Updated 9 months ago
openvinotoolkit / openvino.genai
Run Generative AI models with simple C++/Python API and using OpenVINO Runtime
☆198 Updated this week
charent / Phi2-mini-Chinese
Phi2-Chinese-0.2B: train your own small Chinese Phi2 chat model from scratch, with support for plugging into langchain to load a local knowledge base for retrieval-augmented generation (RAG).
☆516 Updated 6 months ago
ModelTC / llmc
[EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a V…
☆382 Updated this week
hyperai / triton-cn
Triton documentation in Simplified Chinese
☆52 Updated last week
modelscope / evalscope
A streamlined and customizable framework for efficient large model evaluation and performance benchmarking
☆357 Updated this week
IEIT-Yuan / Yuan2.0-M32
Mixture-of-Experts (MoE) Language Model
☆184 Updated 4 months ago
intel / llm-on-ray
Pretrain, finetune and serve LLMs on Intel platforms with Ray
☆108 Updated 2 months ago