intel / ipex-llm
Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, DeepSeek, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, DeepSpeed, Axolotl, etc.
☆8,216 · Updated this week
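The header above describes ipex-llm's drop-in HuggingFace integration with low-bit optimizations on Intel XPU. A minimal sketch of that usage pattern follows, assuming ipex-llm is installed and an Intel XPU device is available; the checkpoint name is a placeholder, not something the page above specifies.

```python
# Hedged sketch of ipex-llm's HuggingFace-style API: load a model with
# low-bit (INT4) weight optimization and run it on an Intel GPU ("xpu").
# Requires the ipex-llm and transformers packages plus Intel XPU hardware.
try:
    from ipex_llm.transformers import AutoModelForCausalLM
    from transformers import AutoTokenizer
    HAVE_IPEX_LLM = True
except ImportError:  # ipex-llm (or transformers) is not installed
    HAVE_IPEX_LLM = False

if HAVE_IPEX_LLM:
    model_path = "Qwen/Qwen2-1.5B-Instruct"  # placeholder checkpoint
    # load_in_4bit=True applies ipex-llm's INT4 weight quantization on load
    model = AutoModelForCausalLM.from_pretrained(model_path, load_in_4bit=True)
    model = model.to("xpu")  # move the optimized model to the Intel GPU
    tokenizer = AutoTokenizer.from_pretrained(model_path)

    inputs = tokenizer("What is ipex-llm?", return_tensors="pt").to("xpu")
    output = model.generate(inputs.input_ids, max_new_tokens=32)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Because the loader mirrors the standard `transformers` interface, existing HuggingFace code typically only needs the import swapped and the `.to("xpu")` call added.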
Alternatives and similar repositories for ipex-llm
Users interested in ipex-llm are comparing it to the libraries listed below.
- Distributed TensorFlow, Keras and PyTorch on Apache Spark/Flink & Ray ☆24 · Updated 5 years ago
- BigDL: Distributed TensorFlow, Keras and PyTorch on Apache Spark/Flink & Ray ☆2,684 · Updated last week
- TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters. ☆3,868 · Updated 2 years ago
- A Python package extending the official PyTorch to deliver additional performance on Intel platforms ☆1,931 · Updated this week
- Accelerate LLM with low-bit (FP4 / INT4 / FP8 / INT8) optimizations using ipex-llm ☆166 · Updated 3 months ago
- Open deep learning compiler stack for CPU, GPU and specialized accelerators ☆12,521 · Updated this week
- Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet. ☆14,564 · Updated 2 weeks ago
- Simple and Distributed Machine Learning ☆5,159 · Updated this week
- oneAPI Deep Neural Network Library (oneDNN) ☆3,863 · Updated this week
- LMDeploy is a toolkit for compressing, deploying, and serving LLMs. ☆6,854 · Updated this week
- Distributed deep learning with Keras & Spark ☆1,572 · Updated 2 years ago
- TransmogrifAI (pronounced trăns-mŏgˈrə-fī) is an AutoML library for building modular, reusable, strongly typed machine learning workflows… ☆2,266 · Updated last year
- The Triton Inference Server provides an optimized cloud and edge inferencing solution. ☆9,614 · Updated last week
- Alluxio, data orchestration for analytics and machine learning in the cloud ☆7,055 · Updated 3 months ago
- A flexible, high-performance serving system for machine learning models ☆6,311 · Updated this week
- High-speed Large Language Model Serving for Local Deployment ☆8,304 · Updated 2 weeks ago
- An open source ML system for the end-to-end data science lifecycle ☆1,057 · Updated this week
- Ongoing research training transformer models at scale ☆13,130 · Updated this week
- ⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Pl… ☆2,170 · Updated 10 months ago
- A Flexible and Powerful Parameter Server for large-scale machine learning ☆6,769 · Updated last week
- TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizati… ☆11,313 · Updated this week
- ☆1,657 · Updated 6 years ago
- Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics ☆15,811 · Updated this week
- OpenVINO™ is an open source toolkit for optimizing and deploying AI inference ☆8,698 · Updated this week
- PredictionIO, a machine learning server for developers and ML engineers. ☆12,530 · Updated 4 years ago
- Transformer related optimization, including BERT, GPT ☆6,270 · Updated last year
- ⚠️ DirectML is in maintenance mode ⚠️ DirectML is a high-performance, hardware-accelerated DirectX 12 library for machine learning. Direct… ☆2,500 · Updated this week
- Code to accompany Advanced Analytics with Spark from O'Reilly Media ☆1,530 · Updated 10 months ago
- LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalabili… ☆3,480 · Updated this week
- MLeap: Deploy ML Pipelines to Production ☆1,519 · Updated 8 months ago