opea-project / docs
This repo contains the documentation for the OPEA project
☆27 · Updated this week
Related projects
Alternatives and complementary repositories for docs
- Evaluation, benchmark, and scorecard targeting performance (throughput and latency), accuracy on popular evaluation harnesses, safety… ☆22 · Updated this week
- GenAI components at the microservice level; a GenAI service composer to create a mega-service ☆76 · Updated this week
- Generative AI Examples is a collection of GenAI examples such as ChatQnA and Copilot, which illustrate the pipeline capabilities of the Open… ☆277 · Updated this week
- Large Language Model Text Generation Inference on Habana Gaudi ☆27 · Updated this week
- A collection of YAML files, Helm Charts, Operator code, and guides to act as an example reference implementation for NVIDIA NIM deploymen… ☆139 · Updated this week
- Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU) ☆153 · Updated this week
- Pretrain, finetune and serve LLMs on Intel platforms with Ray ☆103 · Updated last week
- IBM development fork of https://github.com/huggingface/text-generation-inference ☆57 · Updated last month
- Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs ☆165 · Updated 2 weeks ago
- Benchmark suite for LLMs from Fireworks.ai ☆58 · Updated 2 weeks ago
- Containerization and cloud native suite for OPEA ☆30 · Updated this week
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs ☆89 · Updated this week
- An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment. ☆57 · Updated this week
- Reference models for Intel(R) Gaudi(R) AI Accelerator ☆155 · Updated 2 weeks ago
- Efficiently tune any LLM from HuggingFace using distributed training (multiple GPUs) and DeepSpeed. Uses Ray AIR to orchestrate the … ☆53 · Updated last year
- Your buddy in the (L)LM space. ☆63 · Updated 2 months ago
- Packages and instructions for training and inference of LLMs on NVIDIA's new GH200 machines ☆19 · Updated 2 months ago
- Iterate fast on your RAG pipelines ☆16 · Updated 3 weeks ago
- Accelerate your Gen AI with NVIDIA NIM and NVIDIA AI Workbench ☆107 · Updated this week
- One-click templates for inferencing Language Models ☆120 · Updated this week
- Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU (XPU) devices. Note… ☆57 · Updated 2 months ago
- 🏋️ A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support of O… ☆257 · Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆253 · Updated last month
- Production-ready LLM model compression/quantization toolkit with accelerated inference support for both CPU/GPU via HF, vLLM, and SGLang. ☆125 · Updated this week
- Run Generative AI models with a simple C++/Python API using OpenVINO Runtime ☆153 · Updated this week
- GroqFlow provides an automated tool flow for compiling machine learning and linear algebra workloads into Groq programs and executing tho… ☆100 · Updated 3 weeks ago