intel / ipex-llm-tutorial
Accelerate LLM with low-bit (FP4 / INT4 / FP8 / INT8) optimizations using ipex-llm
☆164Updated last month
Alternatives and similar repositories for ipex-llm-tutorial
Users who are interested in ipex-llm-tutorial are comparing it to the repositories listed below
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including …☆254Updated last week
- ☆427Updated this week
- LLM Inference benchmark☆419Updated 10 months ago
- ☆166Updated this week
- LLM101n: Let's build a Storyteller (Chinese edition)☆131Updated 9 months ago
- llm-export can export LLM models to ONNX.☆293Updated 4 months ago
- run ChatGLM2-6B in BM1684X☆49Updated last year
- llm-inference is a platform for publishing and managing llm inference, providing a wide range of out-of-the-box features for model deploy…☆81Updated last year
- Run through the ChatGPT technical pipeline from scratch.☆238Updated 9 months ago
- LLM/MLOps/LLMOps☆88Updated last week
- export llama to onnx☆124Updated 5 months ago
- FlagScale is a large model toolkit based on open-sourced projects.☆281Updated this week
- Pretrain, finetune and serve LLMs on Intel platforms with Ray☆127Updated last month
- C++ implementation of Qwen-LM☆588Updated 6 months ago
- Ascend PyTorch adapter (torch_npu). Mirror of https://gitee.com/ascend/pytorch☆371Updated this week
- vLLM Documentation in Simplified Chinese☆75Updated 3 weeks ago
- Tongyi Qianwen (Qwen) vLLM inference deployment demo☆580Updated last year
- ☆44Updated 7 months ago
- Deploying LLMs offline on the NVIDIA Jetson platform marks the dawn of a new era in embodied intelligence, where devices can function ind…☆96Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs☆37Updated 4 months ago
- Efficient, Flexible, and Highly Fault-Tolerant Model Service Management Based on SGLang☆53Updated 6 months ago
- Alpaca Chinese Dataset -- Chinese instruction fine-tuning dataset☆205Updated 8 months ago
- An attempt to write an LLM from scratch, with reference to llama and nanogpt☆62Updated last year
- FlagPerf is an open-source software platform for benchmarking AI chips.☆333Updated last week
- Community maintained hardware plugin for vLLM on Ascend☆721Updated this week
- Qwen1.5-SFT (Alibaba), Qwen_Qwen1.5-2B-Chat/Qwen_Qwen1.5-7B-Chat fine-tuning (transformers)/LoRA (peft)/inference☆62Updated last year
- ☆332Updated 4 months ago
- Triton Documentation in Simplified Chinese☆71Updated last month
- Run generative AI models in sophgo BM1684X/BM1688☆216Updated this week
- Materials for learning SGLang☆426Updated this week