intel / ipex-llm
Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, DeepSeek, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, DeepSpeed, Axolotl, etc.
☆8,669 · Updated last week
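As a rough illustration of the HuggingFace integration mentioned in the description above, the sketch below loads a causal LM through ipex-llm's transformers wrapper with INT4 weight quantization and runs generation on an Intel GPU ("xpu" device). The model name and prompt are placeholders, and the exact API surface may vary between ipex-llm releases.

```python
# Minimal sketch: HuggingFace-style loading with ipex-llm low-bit optimization.
# Model path and prompt are illustrative; swap in your own checkpoint.
import torch
from transformers import AutoTokenizer
from ipex_llm.transformers import AutoModelForCausalLM  # ipex-llm drop-in wrapper

model_path = "meta-llama/Llama-2-7b-chat-hf"  # any HF causal-LM checkpoint

# load_in_4bit applies INT4 weight quantization while loading the weights
model = AutoModelForCausalLM.from_pretrained(model_path,
                                             load_in_4bit=True,
                                             trust_remote_code=True)
model = model.to("xpu")  # move to the Intel GPU; use "cpu" if no XPU is available

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
inputs = tokenizer("What is Intel XPU?", return_tensors="pt").to("xpu")

with torch.inference_mode():
    output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```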
Alternatives and similar repositories for ipex-llm
Users interested in ipex-llm are comparing it to the libraries listed below.
- BigDL: Distributed TensorFlow, Keras and PyTorch on Apache Spark/Flink & Ray ☆2,692 · Updated 2 months ago
- TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters. ☆3,859 · Updated 2 years ago
- Accelerate LLM with low-bit (FP4 / INT4 / FP8 / INT8) optimizations using ipex-llm ☆168 · Updated 9 months ago
- A Python package that extends the official PyTorch to easily obtain performance gains on Intel platforms ☆2,010 · Updated this week
- SGLang is a high-performance serving framework for large language models and multimodal models. ☆23,091 · Updated this week
- A flexible, high-performance serving system for machine learning models ☆6,352 · Updated last month
- Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on sing… ☆27,966 · Updated this week
- TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizat… ☆12,811 · Updated this week
- Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet. ☆14,665 · Updated 2 months ago
- LMDeploy is a toolkit for compressing, deploying, and serving LLMs. ☆7,576 · Updated this week
- An open source ML system for the end-to-end data science lifecycle ☆1,079 · Updated last week
- Integration of TensorFlow with other open-source frameworks ☆1,374 · Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆69,622 · Updated this week
- Open Machine Learning Compiler Framework ☆13,096 · Updated this week
- Large Language Model Text Generation Inference ☆10,749 · Updated last month
- Distributed deep learning on Hadoop and Spark clusters. ☆1,262 · Updated 6 years ago
- Fast and memory-efficient exact attention ☆22,113 · Updated this week
- Retrieval and Retrieval-augmented LLMs ☆11,256 · Updated last month
- Inference Llama 2 in one file of pure C ☆19,146 · Updated last year
- PyTorch native post-training library ☆5,660 · Updated last week
- A scalable machine learning library on Apache Spark ☆796 · Updated 4 years ago
- Run, manage, and scale AI workloads on any AI infrastructure. Use one system to access & manage all AI compute (Kubernetes, 20+ clouds, o… ☆9,418 · Updated this week
- Distributed deep learning with Keras & Spark ☆1,578 · Updated 2 years ago
- On-device AI across mobile, embedded and edge for PyTorch ☆4,226 · Updated this week
- Breeze is/was a numerical processing library for Scala. ☆3,458 · Updated 4 months ago
- ⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Pl… ☆2,174 · Updated last year
- AI PC starter app for doing AI image creation, image stylizing, and chatbot on a PC powered by an Intel® Arc™ GPU. ☆735 · Updated this week
- JupyterLab computational environment. ☆15,004 · Updated this week
- Step-by-step deep learning tutorials on Apache Spark using BigDL ☆211 · Updated 3 years ago
- Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI. ☆4,677 · Updated last week