determined-ai / determined-examples
Example ML projects that use the Determined library.
☆32 · Updated 10 months ago
Alternatives and similar repositories for determined-examples
Users interested in determined-examples are comparing it to the libraries listed below.
- The source code of our work "Prepacking: A Simple Method for Fast Prefilling and Increased Throughput in Large Language Models" [AISTATS … ☆61 · Updated 9 months ago
- NAACL '24 (Best Demo Paper Runner-Up) / MLSys @ NeurIPS '23 - RedCoast: A Lightweight Tool to Automate Distributed Training and Inference ☆66 · Updated 7 months ago
- ☆74 · Updated 4 months ago
- A parallel framework for training deep neural networks ☆63 · Updated 4 months ago
- ☆120 · Updated last year
- Repository for Sparse Finetuning of LLMs via a modified version of the MosaicML llmfoundry ☆42 · Updated last year
- 🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components. ☆207 · Updated this week
- The Triton backend for PyTorch TorchScript models. ☆158 · Updated this week
- A collection of reproducible inference engine benchmarks ☆32 · Updated 3 months ago
- ArcticTraining is a framework designed to simplify and accelerate the post-training process for large language models (LLMs) ☆195 · Updated this week
- Make Triton easier ☆47 · Updated last year
- Benchmark suite for LLMs from Fireworks.ai ☆76 · Updated last week
- Load compute kernels from the Hub ☆220 · Updated last week
- Distributed preprocessing and data loading for language datasets ☆39 · Updated last year
- ☆15 · Updated 4 months ago
- Easy and Efficient Quantization for Transformers ☆198 · Updated last month
- Intel Gaudi's Megatron DeepSpeed Large Language Models for training ☆13 · Updated 7 months ago
- ☆45 · Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆266 · Updated 9 months ago
- Boosting 4-bit inference kernels with 2:4 Sparsity ☆80 · Updated 11 months ago
- CUDA and Triton implementations of Flash Attention with SoftmaxN. ☆71 · Updated last year
- A safetensors extension to efficiently store sparse quantized tensors on disk ☆142 · Updated this week
- Dynamic batching library for deep learning inference. Tutorials for LLM, GPT scenarios. ☆102 · Updated 11 months ago
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs ☆87 · Updated last week
- JORA: JAX Tensor-Parallel LoRA Library (ACL 2024) ☆35 · Updated last year
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024 ☆324 · Updated 3 months ago
- IBM development fork of https://github.com/huggingface/text-generation-inference ☆61 · Updated 3 months ago
- Train, tune, and infer the Bamba model ☆130 · Updated 2 months ago
- ☆107 · Updated 11 months ago
- A place to store reusable transformer components of my own creation or found on the interwebs ☆59 · Updated last week