nebuly-ai / exploring-AI-optimization
Curated list of awesome material on optimization techniques to make artificial intelligence faster and more efficient
☆116 Updated last year
Alternatives and similar repositories for exploring-AI-optimization
Users who are interested in exploring-AI-optimization are comparing it to the libraries listed below
- ML model training for edge devices ☆165 Updated last year
- An open-source efficient deep learning framework/compiler, written in Python. ☆704 Updated last week
- Fine-tune an LLM to perform batch inference and online serving. ☆112 Updated last month
- Experiments with inference on Llama ☆104 Updated last year
- Context manager to profile the forward and backward times of PyTorch's nn.Module ☆83 Updated last year
- Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models. ☆137 Updated 11 months ago
- ☆250 Updated 11 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆264 Updated 8 months ago
- Deep learning with PyTorch Lightning ☆1 Updated 8 months ago
- A JAX-based library for building transformers, includes implementations of GPT, Gemma, Llama, Mixtral, Whisper, Swin, ViT and more. ☆288 Updated 10 months ago
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs ☆87 Updated this week
- ☆199 Updated last year
- Collection of kernels written in Triton language ☆132 Updated 2 months ago
- git extension for {collaborative, communal, continual} model development ☆213 Updated 7 months ago
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day ☆257 Updated last year
- An open-source AutoML Library based on PyTorch ☆307 Updated 2 months ago
- Code for the paper "KNAS: Green Neural Architecture Search" ☆92 Updated 3 years ago
- A collection of all available inference solutions for LLMs ☆90 Updated 3 months ago
- A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind… ☆158 Updated last week
- TitanML Takeoff Server is an optimization, compression and deployment platform that makes state of the art machine learning models access… ☆114 Updated last year
- Fast low-bit matmul kernels in Triton ☆323 Updated last week
- Implementation of a Transformer, but completely in Triton ☆269 Updated 3 years ago
- Trade any tensors over the network ☆30 Updated last year
- A scalable & efficient active learning/data selection system for everyone. ☆214 Updated 11 months ago
- Presents comprehensive benchmarks of XLA-compatible pre-trained models in Keras. ☆37 Updated last year
- ☆29 Updated 2 years ago
- A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support of O… ☆304 Updated last month
- ML model optimization product to accelerate inference. ☆325 Updated 3 weeks ago
- A tool to analyze and debug neural networks in PyTorch. Use a GUI to traverse the computation graph and view the data from many different… ☆287 Updated 6 months ago
- Interactive performance profiling and debugging tool for PyTorch neural networks. ☆61 Updated 5 months ago