opea-project / docs
This repo contains the documentation for the OPEA project.
☆28 · Updated this week
Alternatives and similar repositories for docs:
Users interested in docs are comparing it to the libraries listed below.
- Evaluation, benchmark, and scorecard, targeting for performance on throughput and latency, accuracy on popular evaluation harness, safety… ☆25 · Updated this week
- GenAI components at micro-service level; GenAI service composer to create mega-service ☆87 · Updated this week
- IBM development fork of https://github.com/huggingface/text-generation-inference ☆58 · Updated 3 weeks ago
- Generative AI Examples is a collection of GenAI examples such as ChatQnA, Copilot, which illustrate the pipeline capabilities of the Open… ☆316 · Updated this week
- Large Language Model Text Generation Inference on Habana Gaudi ☆29 · Updated this week
- Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU) ☆165 · Updated this week
- Benchmark suite for LLMs from Fireworks.ai ☆64 · Updated last month
- A collection of YAML files, Helm Charts, Operator code, and guides to act as an example reference implementation for NVIDIA NIM deploymen… ☆151 · Updated last week
- ☆150 · Updated this week
- Intel® AI for Enterprise RAG converts enterprise data into actionable insights with excellent TCO. Utilizing Intel Gaudi AI accelerators … ☆12 · Updated 3 weeks ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆47 · Updated this week
- Pretrain, finetune and serve LLMs on Intel platforms with Ray ☆108 · Updated 2 months ago
- ☆52 · Updated 4 months ago
- CloudAI Benchmark Framework ☆47 · Updated this week
- Repository for open inference protocol specification ☆45 · Updated 5 months ago
- Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU (XPU) device. Note… ☆58 · Updated last month
- ☆41 · Updated last month
- Tune efficiently any LLM model from HuggingFace using distributed training (multiple GPU) and DeepSpeed. Uses Ray AIR to orchestrate the … ☆53 · Updated last year
- Containerization and cloud native suite for OPEA ☆33 · Updated this week
- Docker image for NVIDIA GH200 machines, optimized for vLLM serving and HF Trainer finetuning ☆23 · Updated 2 weeks ago
- Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs ☆183 · Updated last month
- An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment. ☆74 · Updated this week
- 🏋️ A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support of O… ☆280 · Updated last month
- PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference ☆46 · Updated this week
- CUDA checkpoint and restore utility ☆264 · Updated 9 months ago
- ☆33 · Updated this week
- Self-host LLMs with vLLM and BentoML ☆79 · Updated this week
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs ☆88 · Updated this week
- Setup and Installation Instructions for Habana binaries, docker image creation ☆25 · Updated 3 weeks ago
- Cloud Native Benchmarking of Foundation Models ☆21 · Updated 2 months ago