Accelerate LLM with low-bit (FP4 / INT4 / FP8 / INT8) optimizations using ipex-llm
☆169Apr 29, 2025Updated 10 months ago
Alternatives and similar repositories for ipex-llm-tutorial
Users that are interested in ipex-llm-tutorial are comparing it to the libraries listed below
- Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, DeepSeek, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V,…☆8,694Jan 28, 2026Updated last month
- [ICML2025] KVTuner: Sensitivity-Aware Layer-wise Mixed Precision KV Cache Quantization for Efficient and Nearly Lossless LLM Inference☆26Jan 27, 2026Updated last month
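- This project uses the PaddlePaddle platform to build an innovative multi-model collaborative system that converts PDF files to Markdown efficiently and accurately.☆27Mar 25, 2025Updated 11 months ago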
- Multi-Agent LLM System for Digital Scam Protection☆12Dec 19, 2024Updated last year
- Image Search Engine with HuggingFace Sentence Transformer☆12Aug 31, 2023Updated 2 years ago
- IBM Quantum Challenge Fall 2023☆10May 23, 2023Updated 2 years ago
- 🤗 Optimum Intel: Accelerate inference with Intel optimization tools☆537Feb 23, 2026Updated last week
- This sample shows how to use the oneAPI Video Processing Library (oneVPL) to perform a single and multi-source video decode and preproces…☆15Jun 15, 2023Updated 2 years ago
- Code Repository for Blog - How to Productionize Large Language Models (LLMs)☆12Mar 27, 2024Updated last year
- Projects completed under LinuxWorld Informatics Ltd. - MLOps Training.☆12Aug 15, 2020Updated 5 years ago
- An OpenAI API compatible images server to generate or manipulate images.☆17Feb 2, 2025Updated last year
- Building reliable Retrieval Augmented Generation(RAG) AI Architecture☆13Jul 30, 2024Updated last year
- ☆13Oct 28, 2020Updated 5 years ago
- ☆13Apr 22, 2024Updated last year
- ☆20Feb 18, 2025Updated last year
- Lab files of IBM's Qiskit Global Summer School 2020.☆17Sep 3, 2020Updated 5 years ago
- ☆17Dec 16, 2024Updated last year
- Official implementation "ChipNet: Budget-Aware Pruning with Heaviside Continuous Approximations"☆21Oct 29, 2022Updated 3 years ago
- Personal voice assistant, with voice interruption and Twilio support☆18Feb 24, 2025Updated last year
- Let AI live in small world by using LangChain☆20May 14, 2023Updated 2 years ago
- NLP/LLM Mlops Pipeline to dev/train/evaluation, scalable deploy and monitoring systems.☆22Mar 15, 2024Updated last year
- A Gradio Web UI for running local LLM on Intel GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max) using IPEX-LLM.☆18Updated this week
- With OpenVINO Test Drive, users can run large language models (LLMs) and models trained by Intel Geti on their devices, including AI PCs …☆37Dec 15, 2025Updated 2 months ago
- KoboldCpp Smart Launcher with GPU Layer and Tensor Override Tuning☆30May 18, 2025Updated 9 months ago
- [NOT MAINTAINED] Cython accelerated fANOVA implementation for Optuna.☆25Aug 16, 2024Updated last year
- A modern, single-page web chat interface for local LLMs (Large Language Models), inspired by the visual style and UX of Anthropic's Claud…☆29May 11, 2025Updated 9 months ago
- CVPR 2024 Research Paper with Code☆48Jun 28, 2024Updated last year
- This sample shows how to deploy ChatGLM3 using OpenVINO☆21Nov 18, 2024Updated last year
- An abstraction library for building domain-specific intelligent agents based on Large Language Models (LLMs). LLMAgent provides a core ar…☆27Feb 5, 2026Updated 3 weeks ago
- AI Agents with Google's Gemini Pro and Gemini Pro Vision Models☆28Jan 19, 2024Updated 2 years ago
- A simple no-install web UI for Ollama and OAI-Compatible APIs!☆31Jan 30, 2025Updated last year
- 🤓 A collection of AWESOME structured summaries of Large Language Models (LLMs)☆31Sep 7, 2023Updated 2 years ago
- A Framework for Narrative Agents☆37Sep 24, 2024Updated last year
- Repo for paper: "Paxion: Patching Action Knowledge in Video-Language Foundation Models" Neurips 23 Spotlight☆37May 23, 2023Updated 2 years ago
- Collection of Deep Reinforcement Learning Jupyter Notebooks. Each notebook is self-contained and presents single algorithm. These include…☆38Mar 7, 2020Updated 5 years ago
- MemVerge Netflow Plugin☆16Jun 24, 2025Updated 8 months ago
- Material for the series of seminars on Large Language Models☆34Apr 21, 2024Updated last year
- Suspension telemetry system for mountain bike or dirt bike☆10Apr 9, 2024Updated last year
- Intelligent Document Processing with AWS AI/ML, published by Packt☆12Feb 5, 2026Updated 3 weeks ago