opea-project / Enterprise-Inference
Intel® AI for Enterprise Inference optimizes AI inference services on Intel hardware using Kubernetes orchestration. It automates LLM deployment, resource provisioning, and configuration tuning to deliver faster inference with less manual work.
☆23 · Updated this week
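As an illustrative sketch only (not taken from the Enterprise-Inference project), automating LLM deployment on Kubernetes typically means templating a manifest along these lines; the image name, model identifier, and resource figures below are hypothetical placeholders:

```yaml
# Hypothetical example: a minimal Kubernetes Deployment for an LLM
# inference server. Image, model, and resource values are placeholders,
# not defaults from the Enterprise-Inference project.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: llm-inference
spec:
  replicas: 1
  selector:
    matchLabels:
      app: llm-inference
  template:
    metadata:
      labels:
        app: llm-inference
    spec:
      containers:
        - name: server
          image: example.org/llm-server:latest   # placeholder image
          args: ["--model", "example-model"]     # placeholder model id
          ports:
            - containerPort: 8000                # serving endpoint
          resources:
            limits:
              cpu: "8"
              memory: 32Gi
```

Tooling like this project wraps such manifests (often via Helm charts) so that model selection, replica counts, and hardware-specific settings are filled in automatically rather than edited by hand.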
Alternatives and similar repositories for Enterprise-Inference
Users interested in Enterprise-Inference are comparing it to the repositories listed below.
- A collection of YAML files, Helm Charts, Operator code, and guides to act as an example reference implementation for NVIDIA NIM deploymen…☆190 · Updated this week
- GenAI Studio is a low code platform to enable users to construct, evaluate, and benchmark GenAI applications. The platform also provides c…☆50 · Updated last month
- Accelerate your Gen AI with NVIDIA NIM and NVIDIA AI Workbench☆181 · Updated 4 months ago
- ☆13 · Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs☆15 · Updated this week
- Build Research and RAG agents with Granite on your laptop☆141 · Updated 2 weeks ago
- This NVIDIA RAG blueprint serves as a reference solution for a foundational Retrieval Augmented Generation (RAG) pipeline.☆266 · Updated last week
- Evaluation, benchmark, and scorecard, targeting performance on throughput and latency, accuracy on popular evaluation harness, safety…☆37 · Updated last month
- GenAI components at micro-service level; GenAI service composer to create mega-service☆175 · Updated this week
- Route LLM requests to the best model for the task at hand.☆107 · Updated this week
- InstructLab Training Library - Efficient Fine-Tuning with Message-Format Data☆42 · Updated this week
- This repository contains Dockerfiles, scripts, yaml files, Helm charts, etc. used to scale out AI containers with versions of TensorFlow …☆52 · Updated this week
- DGXC Benchmarking provides recipes in ready-to-use templates for evaluating performance of specific AI use cases across hardware and soft…☆38 · Updated 2 weeks ago
- For individual users, watsonx Code Assistant can access a local IBM Granite model☆35 · Updated 3 months ago
- ☆90 · Updated last month
- Tutorials for running models on First-gen Gaudi and Gaudi2 for Training and Inference. The source files for the tutorials on https://dev…☆61 · Updated last week
- This repo contains documents of the OPEA project☆44 · Updated last month
- An NVIDIA AI Workbench example project for fine-tuning a Mistral 7B model☆61 · Updated last year
- ☆178 · Updated this week
- AI21 Python SDK☆68 · Updated last week
- GitHub bot to assist with the taxonomy contribution workflow☆17 · Updated 10 months ago
- Large Language Model Text Generation Inference on Habana Gaudi☆34 · Updated 6 months ago
- Machine Learning using oneAPI. Explores Intel Extensions for scikit-learn* and NumPy, SciPy, and Pandas powered by oneAPI☆41 · Updated last year
- InstructLab Community wide collaboration space including contributing, security, code of conduct, etc.☆91 · Updated last week
- Neo4j Extensions and Integrations with Vertex AI and LangChain☆27 · Updated 5 months ago
- A framework for fine-tuning retrieval-augmented generation (RAG) systems.☆130 · Updated this week
- This repository is a combination of llama workflows and agents, which is a powerful concept.☆17 · Updated last year
- ☆168 · Updated this week
- Explore our open source AI portfolio! Develop, train, and deploy your AI solutions with performance- and productivity-optimized tools fro…☆52 · Updated 6 months ago
- PARIS (Perpetual Adaptive Regenerative Intelligence System) is a conceptual model for building and managing effective AI and Language Mod…☆27 · Updated 2 years ago