foundation-model-stack / fm-training-estimator
Estimate resources needed to train LLMs
★13 · Updated 2 months ago
Alternatives and similar repositories for fm-training-estimator:
Users who are interested in fm-training-estimator are comparing it to the libraries listed below.
- Collection of tuning recipes with HuggingFace SFTTrainer and PyTorch FSDP. ★41 · Updated this week
- InstructLab Training Library - Efficient Fine-Tuning with Message-Format Data ★37 · Updated this week
- ★12 · Updated last week
- Python library for Synthetic Data Generation ★41 · Updated last week
- ★40 · Updated last month
- Collection of libraries used with fms-hf-tuning to accelerate fine-tuning and training of large models. ★9 · Updated this week
- IBM development fork of https://github.com/huggingface/text-generation-inference ★60 · Updated 4 months ago
- Cloud Native Benchmarking of Foundation Models ★32 · Updated 6 months ago
- Python library for Evaluation ★14 · Updated last week
- GitHub bot to assist with the taxonomy contribution workflow ★16 · Updated 6 months ago
- Caikit is an AI toolkit that enables users to manage models through a set of developer friendly APIs. ★105 · Updated 7 months ago
- Tackle Data-intensive Validity Analyzer ★39 · Updated last year
- Simplifying the definition and execution, scaling and deployment of pipelines on the cloud. ★232 · Updated last year
- Repository to demo GPU Sharing with Time Slicing, MPS, MIG and others ★39 · Updated 6 months ago
- This repository contains resources, documentation and artifacts describing LLM agents ★14 · Updated 3 months ago
- Create and deploy virtual-experiments - co-processing computational workflows ★10 · Updated last month
- InstaSlice Operator facilitates slicing of accelerators using stable APIs ★35 · Updated this week
- KubeStellar - a flexible solution for multi-cluster configuration management for edge, multi-cloud, and hybrid cloud ★378 · Updated this week
- An intuitive, easy-to-use python interface for batch resource requesting, access, job submission, and observation. Simplifying the develo… ★27 · Updated 2 weeks ago
- Artifacts for the Distributed Workloads stack as part of ODH ★30 · Updated last week
- ★19 · Updated last week
- Tutorials and demos related to move2kube ★12 · Updated 2 months ago
- Optimized primitives for collective multi-GPU communication ★9 · Updated 11 months ago
- Improve ROSA customer experience (and customer retention) by leveraging foundation models to do “gpt-chat” style search of Red Hat custo… ★27 · Updated last year
- Python bindings for TrustyAI's explainability library ★16 · Updated last week
- Dolomite Engine is a library for pretraining/finetuning LLMs ★52 · Updated this week
- Oper8 is a framework for writing kubernetes operators in python. It implements many common patterns used by large cloud applications that… ★17 · Updated last week
- Move2Kube is a command-line tool for automating creation of Infrastructure as code (IaC) artifacts. It has inbuilt support for creating I… ★393 · Updated 2 months ago
- WIP. Veloce is a low-code Ray-based parallelization library that makes machine learning computation novel, efficient, and heterogeneous. ★18 · Updated 2 years ago
- Model Server for Kepler ★27 · Updated this week