intel / ipex-llm-tutorial
Accelerate LLM with low-bit (FP4 / INT4 / FP8 / INT8) optimizations using ipex-llm
☆168 · Updated 4 months ago
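As a quick illustration of what the tutorial covers, below is a minimal sketch of loading a Hugging Face model through ipex-llm's transformers-style low-bit API; the model id is a placeholder and the exact keyword arguments (e.g. load_in_4bit, load_in_low_bit) should be verified against the ipex-llm documentation.

```python
# Minimal sketch of low-bit (INT4) model loading with ipex-llm.
# The model id is a placeholder; check argument names against the ipex-llm docs.
import torch
from transformers import AutoTokenizer
from ipex_llm.transformers import AutoModelForCausalLM  # drop-in replacement for the HF class

model_path = "Qwen/Qwen2-1.5B-Instruct"  # placeholder model id

# load_in_4bit=True quantizes weights to INT4 at load time;
# other low-bit formats are selected via load_in_low_bit (e.g. "fp8").
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    load_in_4bit=True,
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

prompt = "What is low-bit quantization?"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.inference_mode():
    output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```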
Alternatives and similar repositories for ipex-llm-tutorial
Users interested in ipex-llm-tutorial are comparing it to the libraries listed below.
- ☆428 · Updated this week
- LLM Inference benchmark ☆426 · Updated last year
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including … ☆264 · Updated last month
- ☆353 · Updated this week
- ☆174 · Updated this week
- Triton Documentation in Simplified Chinese / Triton 中文文档 ☆82 · Updated 5 months ago
- LLM101n: Let's build a Storyteller (Chinese edition) ☆132 · Updated last year
- ☆50 · Updated 10 months ago
- LLM inference service performance testing ☆44 · Updated last year
- llama2.c-zh, a small language model supporting Chinese-language scenarios ☆149 · Updated last year
- Run DeepSeek-R1 GGUFs on KTransformers ☆251 · Updated 6 months ago
- Ascend PyTorch adapter (torch_npu). Mirror of https://gitee.com/ascend/pytorch ☆432 · Updated last week
- C++ implementation of Qwen-LM ☆605 · Updated 9 months ago
- LLM/MLOps/LLMOps ☆114 · Updated 3 months ago
- Pretrain a wiki LLM using transformers ☆51 · Updated last year
- llm-inference is a platform for publishing and managing LLM inference, providing a wide range of out-of-the-box features for model deploy… ☆86 · Updated last year
- vLLM Documentation in Simplified Chinese / vLLM 中文文档 ☆102 · Updated 2 weeks ago
- Pretrain, finetune and serve LLMs on Intel platforms with Ray ☆132 · Updated 2 weeks ago
- Run generative AI models on Sophgo BM1684X/BM1688 ☆240 · Updated last week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆62 · Updated 10 months ago
- A lightweight LLM inference framework ☆739 · Updated last year
- MindSpore online courses: Step into LLM ☆477 · Updated last month
- ☆319 · Updated 2 months ago
- Community-maintained hardware plugin for vLLM on Ascend ☆1,128 · Updated this week
- Qwen (通义千问) vLLM inference deployment demo ☆603 · Updated last year
- Inferflow is an efficient and highly configurable inference engine for large language models (LLMs). ☆248 · Updated last year
- Phi2-Chinese-0.2B: train your own small Chinese Phi2 chat model from scratch; supports LangChain integration for loading a local knowledge base for retrieval-augmented generation (RAG). ☆566 · Updated last year
- Low-bit LLM inference on CPU/NPU with lookup table ☆857 · Updated 3 months ago
- ☆497 · Updated last week
- An attempt to write an LLM from scratch, referencing llama and nanoGPT ☆65 · Updated last year