graphcore / tutorials
Training material for IPU users: tutorials, feature examples, and simple applications
☆86 · Updated 2 years ago
Alternatives and similar repositories for tutorials
Users interested in tutorials are comparing it to the repositories listed below.
- PyTorch interface for the IPU ☆179 · Updated last year
- Example code and applications for machine learning on Graphcore IPUs ☆323 · Updated last year
- Poplar Advanced Runtime for the IPU ☆7 · Updated last year
- Poplar libraries ☆119 · Updated last year
- TensorFlow for the IPU ☆78 · Updated last year
- Research and development for optimizing transformers ☆126 · Updated 4 years ago
- Memory Optimizations for Deep Learning (ICML 2023) ☆64 · Updated last year
- Blazing fast training of 🤗 Transformers on Graphcore IPUs ☆85 · Updated last year
- Fast sparse deep learning on CPUs ☆53 · Updated 2 years ago
- A schedule language for large model training ☆146 · Updated 10 months ago
- ☆68 · Updated last month
- oneCCL Bindings for Pytorch* ☆95 · Updated 2 weeks ago
- Graph algorithms for machine learning frameworks ☆29 · Updated 2 years ago
- ☆104 · Updated 8 months ago
- 🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components. ☆194 · Updated this week
- Distributed preprocessing and data loading for language datasets ☆39 · Updated last year
- ☆158 · Updated last year
- Fast low-bit matmul kernels in Triton ☆297 · Updated this week
- A Data-Centric Compiler for Machine Learning ☆82 · Updated last year
- PyTorch extension for emulating FP8 data formats on standard FP32 Xeon/GPU hardware. ☆110 · Updated 5 months ago
- Torch Distributed Experimental ☆115 · Updated 9 months ago
- Collection of kernels written in the Triton language ☆122 · Updated last month
- ☆117 · Updated last year
- SparseTIR: Sparse Tensor Compiler for Deep Learning ☆136 · Updated 2 years ago
- Boosting 4-bit inference kernels with 2:4 Sparsity ☆73 · Updated 8 months ago
- High-speed GEMV kernels, with up to 2.7x speedup over the PyTorch baseline ☆106 · Updated 10 months ago
- FTPipe and related pipeline model parallelism research ☆41 · Updated last year
- Benchmark code for the "Online normalizer calculation for softmax" paper (the algorithm is sketched after this list) ☆91 · Updated 6 years ago
- Efficient GPU support for LLM inference with x-bit quantization (e.g. FP6, FP5) ☆248 · Updated 6 months ago
- Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity ☆206 · Updated last year
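
For reference, the "Online normalizer calculation for softmax" entry above benchmarks the one-pass softmax algorithm of Milakov & Gimelshein (2018). Below is a minimal Python sketch of that algorithm, not the benchmark repository's own code; the function name `online_softmax` is ours, and plain Python floats stand in for the paper's GPU kernels.

```python
import math

def online_softmax(xs):
    """One-pass softmax via an online normalizer (Milakov & Gimelshein, 2018).

    Illustrative sketch only. Maintains a running maximum m and a running
    normalizer d = sum of exp(x_i - m); whenever a new maximum appears,
    the old sum is rescaled by exp(m_old - m_new), so a single pass over
    the input suffices before the final normalization.
    """
    m = float("-inf")  # running maximum seen so far
    d = 0.0            # running sum of exp(x_i - m)
    for x in xs:
        m_new = max(m, x)
        d = d * math.exp(m - m_new) + math.exp(x - m_new)
        m = m_new
    return [math.exp(x - m) / d for x in xs]

# Matches the naive two-pass softmax on small inputs:
print(online_softmax([1.0, 2.0, 3.0]))  # ~[0.0900, 0.2447, 0.6652]
```

The rescaling step is what makes the computation numerically safe in one pass: subtracting the running maximum keeps every `exp` argument non-positive, avoiding overflow without a separate max-finding pass over the data.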