opea-project / docs
This repo contains the documentation for the OPEA project
Related projects:
- Evaluation, benchmark, and scorecard, targeting performance (throughput and latency), accuracy on popular evaluation harnesses, safety…
- GenAI components at the micro-service level; a GenAI service composer to create mega-services
- Generative AI Examples is a collection of GenAI examples, such as ChatQnA and Copilot, which illustrate the pipeline capabilities of the Open…
- IBM development fork of https://github.com/huggingface/text-generation-inference
- Easy and lightning-fast training of 🤗 Transformers on the Habana Gaudi processor (HPU)
- Large Language Model Text Generation Inference on Habana Gaudi
- A collection of YAML files, Helm charts, Operator code, and guides to act as an example reference implementation for NVIDIA NIM deploymen…
- Machine Learning Agility (MLAgility) benchmark and benchmarking tools
- NIM Agent Blueprint for multimodal PDF extraction
- Reference models for the Intel(R) Gaudi(R) AI Accelerator
- One-click templates for language model inference
- Pretrain, fine-tune, and serve LLMs on Intel platforms with Ray
- Advanced quantization algorithm for LLMs; the official implementation of "Optimize Weight Rounding via Signed Gradient Descent for t…
- This repository contains Dockerfiles, scripts, YAML files, Helm charts, etc., used to scale out AI containers with versions of TensorFlow …
- Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
- Deploy and scale LLM-based applications
- Evaluate and enhance your LLM deployments for real-world inference needs
- Efficiently fine-tune any LLM from HuggingFace using distributed training (multiple GPUs) and DeepSpeed. Uses Ray AIR to orchestrate the …
- Quickly and securely turn any Linux box into a build and deployment assistant
- Tensor library for machine learning
- Run Generative AI models using the native OpenVINO C++ API
- Google TPU optimizations for Transformers models
- TitanML Takeoff Server is an optimization, compression, and deployment platform that makes state-of-the-art machine learning models access…
- 🏋️ A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers, and Sentence-Transformers with full support of O…
- vLLM: a high-throughput and memory-efficient inference and serving engine for LLMs