intel-analytics / ipex-llm-tutorial
Accelerate LLM with low-bit (FP4 / INT4 / FP8 / INT8) optimizations using ipex-llm
☆145Updated 3 months ago
Related projects
Alternatives and complementary repositories for ipex-llm-tutorial
- ☆380Updated last week
- MindSpore online courses: Step into LLM☆430Updated this week
- RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.☆545Updated last month
- LLM Inference benchmark☆350Updated 3 months ago
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including …☆137Updated 2 months ago
- llm-export can export LLM models to ONNX.☆231Updated last week
- A beginner's introductory tutorial on model compression☆198Updated this week
- ☆45Updated 7 months ago
- Run generative AI models in sophgo BM1684X☆125Updated this week
- TinyRAG☆236Updated 3 weeks ago
- Compare different hardware platforms via the Roofline Model for LLM inference tasks.☆75Updated 8 months ago
- Low-bit LLM inference on CPU with lookup table☆588Updated this week
- A curated collection of high-quality full-stack LLM resources☆370Updated this week
- Walk through ChatGPT's technical roadmap from scratch.☆152Updated 2 months ago
- export llama to onnx☆97Updated 5 months ago
- run ChatGLM2-6B in BM1684X☆48Updated 8 months ago
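- AIFoundation covers what happens when AI systems meet large models: the full-stack core technologies for supporting large-model training and inference at the system level, from the bottom layer to the top.☆295Updated 2 months ago
- Qwen (Tongyi Qianwen) VLLM inference and deployment demo☆445Updated 7 months ago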
- ☆126Updated this week
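- llama2.c-zh: a small language model that supports Chinese-language scenarios☆146Updated 8 months ago
- Theory and practice of large model (LLM) inference and deployment☆82Updated this week
- Phi2-Chinese-0.2B: train your own small Chinese Phi2 chat model from scratch, with support for loading a local knowledge base through langchain for retrieval-augmented generation (RAG).☆489Updated 4 months ago
- LLM101n: Let's build a Storyteller (Chinese edition)☆118Updated 3 months ago
- Implement a small-parameter Chinese large language model from scratch.☆280Updated 3 months ago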
- Inference code for LLaMA models☆109Updated last year
- C++ implementation of Qwen-LM☆553Updated 10 months ago
- Build a MiniLLM from 0 to 1 (pretrain + sft + dpo, practice in progress)☆329Updated 2 months ago
- llm-inference is a platform for publishing and managing llm inference, providing a wide range of out-of-the-box features for model deploy…☆70Updated 6 months ago
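- A good project for campus, autumn, and spring recruitment and internships: build, hands-on and from scratch, a large-model inference framework supporting LLama2/3 and Qwen2.5.☆228Updated 2 weeks ago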
- LLaMa/RWKV onnx models, quantization and testcase☆353Updated last year