foundation-model-stack / fm-training-estimator
Estimate resources needed to train LLMs
☆13Updated 7 months ago
Alternatives and similar repositories for fm-training-estimator
Users who are interested in fm-training-estimator are comparing it to the libraries listed below.
- 🚀 Collection of tuning recipes with HuggingFace SFTTrainer and PyTorch FSDP.☆49Updated last week
- Create and deploy virtual-experiments - co-processing computational workflows☆10Updated 2 months ago
- llm-d benchmark scripts and tooling☆28Updated this week
- Bridge operator repo☆21Updated last week
- InstructLab Training Library - Efficient Fine-Tuning with Message-Format Data☆42Updated this week
- ☆49Updated last month
- ☆12Updated 3 weeks ago
- Python library for Synthetic Data Generation☆50Updated this week
- Cloud Native Benchmarking of Foundation Models☆42Updated last month
- Community maintained hardware plugin for vLLM on Spyre☆33Updated this week
- ☆23Updated 3 years ago
- MLPerf™ logging library☆37Updated this week
- A tool to detect infrastructure issues on cloud native AI systems☆47Updated last week
- Helm charts for llm-d☆50Updated 2 months ago
- Trusted Service Identity is closing the gap of preventing access to secrets by an untrusted operator during the process of obtaining auth…☆27Updated last week
- IBM development fork of https://github.com/huggingface/text-generation-inference☆61Updated last week
- Simplifying the definition and execution, scaling and deployment of pipelines on the cloud.☆233Updated 2 years ago
- A top-like tool for monitoring GPUs in a cluster☆85Updated last year
- ☆254Updated last week
- Caikit is an AI toolkit that enables users to manage models through a set of developer friendly APIs.☆107Updated 2 months ago
- ☆40Updated this week
- This repository contains the results and code for the MLPerf™ Training v4.0 benchmark.☆12Updated last year
- Module, Model, and Tensor Serialization/Deserialization☆267Updated last month
- CLI for the Serverless Supercomputer☆24Updated last week
- Repository for open inference protocol specification☆59Updated 4 months ago
- Cray-HPE System Management Documentation for Shasta, High-Performance-Computing-as-a-Service (HPCaaS).☆29Updated this week
- Integrations between commercial and open source applications and LSF published by IBM and others.☆16Updated last year
- Optimizing loading training data from cloud bucket storage for cloud-based distributed deep learning. Official repository for Quantifying…☆12Updated 3 years ago
- 🚀 Collection of libraries used with fms-hf-tuning to accelerate fine-tuning and training of large models.☆11Updated this week
- InstructLab Community wide collaboration space including contributing, security, code of conduct, etc☆91Updated last week