OpenVINO-dev-contest / llama2.openvino
⭐ 44 · Updated this week
Related projects:
- Run Generative AI models using native OpenVINO C++ API (⭐ 107, updated this week)
- 🤗 Optimum Intel: Accelerate inference with Intel optimization tools (⭐ 380, updated this week)
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs (⭐ 87, updated this week)
- An innovative library for efficient LLM inference via low-bit quantization (⭐ 342, updated 2 weeks ago)
- Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU) (⭐ 144, updated this week)
- Advanced Quantization Algorithm for LLMs. This is the official implementation of "Optimize Weight Rounding via Signed Gradient Descent for t…" (⭐ 205, updated this week)
- AMD related optimizations for transformer models (⭐ 46, updated this week)
- Python bindings for ggml (⭐ 125, updated 2 weeks ago)
- (⭐ 110, updated 4 months ago)
- Pretrain, finetune and serve LLMs on Intel platforms with Ray (⭐ 95, updated this week)
- A general 2-8 bits quantization toolbox with GPTQ/AWQ/HQQ, and export to onnx/onnx-runtime easily (⭐ 141, updated 3 weeks ago)
- A curated list of OpenVINO based AI projects (⭐ 92, updated 3 weeks ago)
- 🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models (⭐ 129, updated last month)
- 🏋️ A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support of O… (⭐ 231, updated this week)
- A high-throughput and memory-efficient inference and serving engine for LLMs (⭐ 250, updated this week)
- LLaMa/RWKV onnx models, quantization and testcase (⭐ 345, updated last year)
- (⭐ 170, updated this week)
- Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU (XPU) devices. Note… (⭐ 56, updated 3 weeks ago)
- This repository contains Dockerfiles, scripts, yaml files, Helm charts, etc. used to scale out AI containers with versions of TensorFlow … (⭐ 23, updated this week)
- Self-host LLMs with vLLM and BentoML (⭐ 62, updated this week)
- The no-code AI toolchain (⭐ 63, updated last week)
- (⭐ 17, updated this week)
- Dynamic batching library for Deep Learning inference. Tutorials for LLM, GPT scenarios (⭐ 81, updated last month)
- Large Language Model Text Generation Inference on Habana Gaudi (⭐ 24, updated last week)
- llama.cpp clone with additional SOTA quants and improved CPU performance