Accelerate LLM with low-bit (FP4 / INT4 / FP8 / INT8) optimizations using ipex-llm
☆170May 4, 2026Updated 2 weeks ago
Alternatives and similar repositories for ipex-llm-tutorial
Users that are interested in ipex-llm-tutorial are comparing it to the libraries listed below.
- Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, DeepSeek, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V,…☆8,805Jan 28, 2026Updated 3 months ago
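- This project uses the PaddlePaddle platform to build an innovative multi-model collaborative system for efficient, accurate conversion of PDF files to Markdown files.☆27Mar 25, 2025Updated last year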
- This sample shows how to use the oneAPI Video Processing Library (oneVPL) to perform a single and multi-source video decode and preproces…☆15Jun 15, 2023Updated 2 years ago
- 🤗 Optimum Intel: Accelerate inference with Intel optimization tools☆585May 15, 2026Updated last week
- ☆13Oct 28, 2020Updated 5 years ago
- 📚 Jupyter notebook tutorials for OpenVINO™☆3,141May 13, 2026Updated last week
- This is Microsoft-Phi-3-NvidiaNIMWorkshop☆22Aug 16, 2024Updated last year
- [ICML2025] KVTuner: Sensitivity-Aware Layer-wise Mixed Precision KV Cache Quantization for Efficient and Nearly Lossless LLM Inference☆28Jan 27, 2026Updated 3 months ago
- Multi-Agent LLM System for Digital Scam Protection☆15Dec 19, 2024Updated last year
- Image Search Engine with HuggingFace Sentence Transformer☆12Aug 31, 2023Updated 2 years ago
- PyTorch code for our paper "Progressive Binarization with Semi-Structured Pruning for LLMs"☆13Mar 11, 2026Updated 2 months ago
- Code Repository for Blog - How to Productionize Large Language Models (LLMs)☆12Mar 27, 2024Updated 2 years ago
- Explained usage examples for xeCJK☆14Feb 27, 2020Updated 6 years ago
- Playing with io_uring in Zig☆17May 14, 2026Updated last week
- ☆20May 28, 2025Updated 11 months ago
- OpenVINO LLM Benchmark☆11Dec 7, 2023Updated 2 years ago
- This repository contains resources, documentation and artifacts describing LLM agents☆15Jan 22, 2025Updated last year
- This course navigates through the LLMOps pipeline, enabling you to preprocess training data for supervised fine-tuning and deploy cust…☆15Feb 13, 2024Updated 2 years ago
- Hikvision SDK C# wrapper☆10Jul 30, 2019Updated 6 years ago
- Row-wise block scaling for fp8 quantization matrix multiplication. Solution to GPU mode AMD challenge.☆19Feb 9, 2026Updated 3 months ago
- ☆14Apr 22, 2024Updated 2 years ago
- Modified Beam Search with periodical restart☆12Sep 12, 2024Updated last year
- official implementation of the paper "Delving into Latent Spectral Biasing of Video VAEs for Superior Diffusability".☆56Dec 25, 2025Updated 4 months ago
- A Python package for extending the official PyTorch that can easily obtain performance on Intel platform☆2,014Mar 30, 2026Updated last month
- An inter-process communication library built on ZeroMQ, supporting publish-subscribe with topic filtering and RPC-mode communication☆15Feb 2, 2023Updated 3 years ago
- ☆17Jan 30, 2024Updated 2 years ago
- A modern, single-page web chat interface for local LLMs (Large Language Models), inspired by the visual style and UX of Anthropic's Claud…☆32May 11, 2025Updated last year
- OData minimal api proof of concept☆10Mar 1, 2023Updated 3 years ago
- With OpenVINO Test Drive, users can run large language models (LLMs) and models trained by Intel Geti on their devices, including AI PCs …☆37Mar 12, 2026Updated 2 months ago
- A simple no-install web UI for Ollama and OAI-Compatible APIs!☆31Jan 30, 2025Updated last year
- Autonomous AI orchestration architecture combining Google Antigravity with Jules API for hands-free development workflows. MCP integratio…☆38Apr 1, 2026Updated last month
- Smart ID photos☆14Feb 18, 2022Updated 4 years ago
- Tools for easier OpenVINO development/debugging☆10Jul 16, 2025Updated 10 months ago
- An open-source tool created by OctoML that converts TVM-optimized models to code runnable in ONNX Runtime.☆17Mar 30, 2023Updated 3 years ago
- Official implementation "ChipNet: Budget-Aware Pruning with Heaviside Continuous Approximations"☆21Oct 29, 2022Updated 3 years ago
- Synthetic data for fine tuning LLM☆27Dec 26, 2024Updated last year
- CVPR 2024 Research Paper with Code☆48Jun 28, 2024Updated last year
- Building reliable Retrieval Augmented Generation(RAG) AI Architecture☆13Jul 30, 2024Updated last year
- Extension based on VSCode editor☆15Apr 7, 2023Updated 3 years ago