intel / ipex-llm
Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, DeepSeek, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, DeepSpeed, Axolotl, etc.
☆8,327 · Updated 2 weeks ago
Alternatives and similar repositories for ipex-llm
Users interested in ipex-llm are comparing it to the libraries listed below.
- Distributed TensorFlow, Keras and PyTorch on Apache Spark/Flink & Ray ☆24 · Updated 5 years ago
- BigDL: Distributed TensorFlow, Keras and PyTorch on Apache Spark/Flink & Ray ☆2,685 · Updated 2 weeks ago
- TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters. ☆3,872 · Updated 2 years ago
- A Python package extending the official PyTorch for improved performance on Intel platforms ☆1,961 · Updated this week
- SGLang is a fast serving framework for large language models and vision language models. ☆18,283 · Updated this week
- Apache Spark - A unified analytics engine for large-scale data processing ☆41,960 · Updated this week
- High-speed Large Language Model Serving for Local Deployment ☆8,334 · Updated last month
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆58,573 · Updated this week
- Large Language Model Text Generation Inference ☆10,527 · Updated last week
- A fast inference library for running LLMs locally on modern consumer-class GPUs ☆4,326 · Updated last month
- Alluxio, data orchestration for analytics and machine learning in the cloud ☆7,072 · Updated 4 months ago
- An open source ML system for the end-to-end data science lifecycle ☆1,062 · Updated 3 weeks ago
- Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads. ☆39,053 · Updated this week
- Notes on the design and implementation of Apache Spark ☆5,340 · Updated last year
- LMDeploy is a toolkit for compressing, deploying, and serving LLMs. ☆7,079 · Updated this week
- The open source developer platform to build AI/LLM applications and models with confidence. Enhance your AI applications with end-to-end … ☆22,194 · Updated this week
- Open deep learning compiler stack for CPU, GPU and specialized accelerators ☆12,643 · Updated this week
- Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet. ☆14,593 · Updated last week
- Tensor library for machine learning ☆13,195 · Updated this week
- MLOps tools for managing and orchestrating the machine learning lifecycle ☆3,676 · Updated last week
- A blazing fast inference solution for text embedding models ☆4,027 · Updated this week
- Universal LLM Deployment Engine with ML Compilation ☆21,390 · Updated this week
- TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizati… ☆11,625 · Updated this week
- H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random F… ☆7,291 · Updated this week
- The Triton Inference Server provides an optimized cloud and edge inferencing solution. ☆9,806 · Updated this week
- Distributed deep learning with Keras & Spark ☆1,574 · Updated 2 years ago
- Build and run containers leveraging NVIDIA GPUs ☆3,672 · Updated this week
- State of the Art Natural Language Processing ☆4,043 · Updated last week
- Distributed deep learning on Hadoop and Spark clusters ☆1,259 · Updated 5 years ago
- NumPy & SciPy for GPU ☆10,502 · Updated last week