foundation-model-stack / fm-training-estimator

Estimate resources needed to train LLMs

☆13

Alternatives and similar repositories for fm-training-estimator

Users who are interested in fm-training-estimator are comparing it to the libraries listed below.

foundation-model-stack / fms-hf-tuning
🚀 Collection of tuning recipes with HuggingFace SFTTrainer and PyTorch FSDP.
☆47 Updated this week
llm-d / llm-d-benchmark
llm-d benchmark scripts and tooling
☆21 Updated this week
st4sd / st4sd-runtime-core
Create and deploy virtual-experiments - co-processing computational workflows
☆10 Updated 3 weeks ago
instructlab / training
InstructLab Training Library - Efficient Fine-Tuning with Message-Format Data
☆42 Updated this week
openshift-psap / llm-load-test
☆47 Updated last week
instructlab / sdg
Python library for Synthetic Data Generation
☆42 Updated this week
caikit / caikit-nlp
☆12 Updated this week
fmperf-project / fmperf
Cloud Native Benchmarking of Foundation Models
☆39 Updated last week
IBM / text-generation-inference
IBM development fork of https://github.com/huggingface/text-generation-inference
☆61 Updated 3 months ago
IBM / Bridge-Operator
Bridge operator repo
☆21 Updated 3 months ago
run-ai / runai-model-streamer
☆232 Updated this week
IBM / autopilot
A tool to detect infrastructure issues on cloud native AI systems
☆44 Updated 2 weeks ago
foundation-model-stack / fms-acceleration
🚀 Collection of libraries used with fms-hf-tuning to accelerate fine-tuning and training of large models.
☆11 Updated last month
project-codeflare / codeflare-transfer-learning
☆23 Updated 3 years ago
instructlab / eval
Python library for Evaluation
☆15 Updated this week
project-codeflare / codeflare
Simplifying the definition and execution, scaling and deployment of pipelines on the cloud.
☆233 Updated last year
coreweave / ml-containers
☆38 Updated this week
groq / mlagility
Machine Learning Agility (MLAgility) benchmark and benchmarking tools
☆39 Updated last week
nicknochnack / ACPxMCPxWatsonx
How to build an ACP compliant agent that uses MCP as well!
☆11 Updated 3 months ago
run-ai / rntop
A top-like tool for monitoring GPUs in a cluster
☆85 Updated last year
open-lm-engine / lm-engine
LM engine is a library for pretraining/finetuning LLMs
☆61 Updated this week
fw-ai / benchmark
Benchmark suite for LLMs from Fireworks.ai
☆76 Updated last week
determined-ai / determined-examples
Example ML projects that use the Determined library.
☆32 Updated 10 months ago
vllm-project / vllm-spyre
Community maintained hardware plugin for vLLM on Spyre
☆30 Updated last week
huggingface / tgi-gaudi
Large Language Model Text Generation Inference on Habana Gaudi
☆34 Updated 4 months ago
llm-d / llm-d-deployer
Helm charts for llm-d
☆51 Updated 2 weeks ago
mlcommons / training_results_v4.0
This repository contains the results and code for the MLPerf™ Training v4.0 benchmark.
☆12 Updated last year
instructlab / instructlab-bot
GitHub bot to assist with the taxonomy contribution workflow
☆17 Updated 9 months ago
coreweave / nccl-tests
NVIDIA NCCL Tests for Distributed Training
☆102 Updated 2 weeks ago
at-aaims / OpenMxP
This is the open source version of HPL-MXP. The code performance has been verified on Frontier
☆17 Updated last month