ShishirPatil / poet
ML model training for edge devices
⭐166 · Updated last year
Alternatives and similar repositories for poet
Users that are interested in poet are comparing it to the libraries listed below
- 🔍 Interactive performance profiling and debugging tool for PyTorch neural networks. ⭐64 · Updated 7 months ago
- Flexible simulator for mixed precision and format simulation of LLMs and vision transformers. ⭐51 · Updated 2 years ago
- A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind… ⭐161 · Updated 2 months ago
- Home for OctoML PyTorch Profiler ⭐114 · Updated 2 years ago
- ⭐154 · Updated 2 years ago
- ⭐252 · Updated last year
- ⭐94 · Updated 3 years ago
- Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models". ⭐277 · Updated last year
- ⭐120 · Updated last year
- ⭐111 · Updated last year
- A schedule language for large model training ⭐149 · Updated 3 weeks ago
- PipeTransformer: Automated Elastic Pipelining for Distributed Training of Large-scale Models (ICML 2021) ⭐56 · Updated 4 years ago
- Memory Optimizations for Deep Learning (ICML 2023) ⭐107 · Updated last year
- A Python library that transfers PyTorch tensors between CPU and NVMe ⭐121 · Updated 9 months ago
- ⭐159 · Updated 2 years ago
- Official code for "SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient" ⭐143 · Updated last year
- GPTQ inference Triton kernel ⭐307 · Updated 2 years ago
- PB-LLM: Partially Binarized Large Language Models ⭐153 · Updated last year
- Training material for IPU users: tutorials, feature examples, simple applications ⭐87 · Updated 2 years ago
- Code for the paper "QuIP: 2-Bit Quantization of Large Language Models With Guarantees" ⭐380 · Updated last year
- (NeurIPS 2022) Automatically finding good model-parallel strategies, especially for complex models and clusters. ⭐40 · Updated 2 years ago
- Training neural networks in TensorFlow 2.0 with 5x less memory ⭐134 · Updated 3 years ago
- AI and Memory Wall ⭐220 · Updated last year
- Advanced Ultra-Low Bitrate Compression Techniques for the LLaMA Family of LLMs ⭐110 · Updated last year
- Research and development for optimizing transformers ⭐130 · Updated 4 years ago
- A list of awesome neural-symbolic papers ⭐48 · Updated 3 years ago
- Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity ⭐220 · Updated last year
- [MLSys'24] Atom: Low-bit Quantization for Efficient and Accurate LLM Serving ⭐320 · Updated last year
- ⭐74 · Updated 5 months ago
- SparseTIR: Sparse Tensor Compiler for Deep Learning ⭐138 · Updated 2 years ago