gnovack / distributed-training-and-deepspeed
☆17 · Updated 2 years ago
Alternatives and similar repositories for distributed-training-and-deepspeed
Users interested in distributed-training-and-deepspeed are comparing it to the repositories listed below.
- ☆121 · Updated last year
- Various transformers for FSDP research · ☆38 · Updated 3 years ago
- Cold Compress is a hackable, lightweight, and open-source toolkit for creating and benchmarking cache compression methods built on top of… · ☆146 · Updated last year
- 🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components. · ☆217 · Updated this week
- Torch Distributed Experimental · ☆117 · Updated last year
- Simple implementation of Speculative Sampling in NumPy for GPT-2. · ☆98 · Updated 2 years ago
- experiments with inference on llama · ☆103 · Updated last year
- A place to store reusable transformer components of my own creation or found on the interwebs · ☆62 · Updated last week
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1 GPU + 1 Day · ☆257 · Updated 2 years ago
- Implementation of a Transformer, but completely in Triton · ☆277 · Updated 3 years ago
- ☆122 · Updated last year
- Context manager to profile the forward and backward times of PyTorch's nn.Module (see the hook-based sketch after this list) · ☆83 · Updated 2 years ago
- 🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… · ☆271 · Updated this week
- Minimal PyTorch implementation of BM25 (with sparse tensors); see the sparse-scoring sketch below
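
For context on the profiling entry above, here is a minimal sketch, not the linked repo's actual code, of how a hook-based context manager can time a module's forward and backward passes. The hook registration methods are standard nn.Module APIs (the backward pre-hook requires PyTorch ≥ 2.0); the `profile_module` name and the timing bookkeeping are illustrative choices.

```python
# A minimal sketch, assuming PyTorch >= 2.0 (register_full_backward_pre_hook).
# `profile_module` is an illustrative name, not the linked repo's API.
import time
from contextlib import contextmanager

import torch
import torch.nn as nn


@contextmanager
def profile_module(module: nn.Module):
    """Collect wall-clock forward/backward durations for `module` while active."""
    times = {"forward": [], "backward": []}
    start = {}

    def fwd_pre(mod, args):                 # fires just before forward()
        start["fwd"] = time.perf_counter()

    def fwd_post(mod, args, output):        # fires right after forward()
        times["forward"].append(time.perf_counter() - start["fwd"])

    def bwd_pre(mod, grad_output):          # fires when the output's gradient arrives
        start["bwd"] = time.perf_counter()

    def bwd_post(mod, grad_input, grad_output):  # fires after input gradients are computed
        times["backward"].append(time.perf_counter() - start["bwd"])

    handles = [
        module.register_forward_pre_hook(fwd_pre),
        module.register_forward_hook(fwd_post),
        module.register_full_backward_pre_hook(bwd_pre),
        module.register_full_backward_hook(bwd_post),
    ]
    try:
        yield times
    finally:
        for h in handles:
            h.remove()


# Usage: time one forward/backward pass of a small MLP.
model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 1))
with profile_module(model) as times:
    loss = model(torch.randn(8, 64, requires_grad=True)).sum()
    loss.backward()
print(times)  # e.g. {'forward': [...], 'backward': [...]}
```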
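
Likewise, for the BM25 entry, a rough illustration of BM25 scoring over a sparse document-term matrix in PyTorch. It assumes a COO count matrix and standard Okapi BM25 parameters k1 and b; `bm25_weights` is a hypothetical helper, not the repo's API.

```python
# A minimal sketch of Okapi BM25 with PyTorch sparse tensors (not the repo's code).
import torch


def bm25_weights(doc_term_counts: torch.Tensor, k1: float = 1.5, b: float = 0.75) -> torch.Tensor:
    """Turn a sparse (num_docs, vocab) term-count matrix into sparse BM25 term weights."""
    counts = doc_term_counts.coalesce()
    num_docs, vocab = counts.shape
    rows, cols = counts.indices()
    tf = counts.values().float()

    doc_lens = torch.sparse.sum(counts, dim=1).to_dense().float()  # |d| per document
    avgdl = doc_lens.mean()

    # Document frequency: each coalesced (doc, term) entry counts once.
    df = torch.zeros(vocab).index_add_(0, cols, torch.ones_like(tf))
    idf = torch.log((num_docs - df + 0.5) / (df + 0.5) + 1.0)

    # Okapi BM25 weight per nonzero (doc, term) entry.
    denom = tf + k1 * (1.0 - b + b * doc_lens[rows] / avgdl)
    vals = idf[cols] * tf * (k1 + 1.0) / denom
    return torch.sparse_coo_tensor(counts.indices(), vals, counts.shape).coalesce()


# Usage: 3 documents over a 5-term vocabulary, query containing terms {1, 3}.
counts = torch.tensor([[2., 1., 0., 0., 0.],
                       [0., 1., 3., 1., 0.],
                       [1., 0., 0., 2., 1.]]).to_sparse()
weights = bm25_weights(counts)
query = torch.zeros(5)
query[[1, 3]] = 1.0
scores = torch.sparse.mm(weights, query.unsqueeze(1)).squeeze(1)  # BM25 score per document
print(scores)
```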