chncwang / InsNet
InsNet Runs Instance-dependent Neural Networks with Padding-free Dynamic Batching.
☆66Updated 3 years ago
Related projects
Alternatives and complementary repositories for InsNet
- A Fast Multi-processing BERT-Inference System☆100Updated 2 years ago
- AutodiffEngine☆13Updated 5 years ago
- Place for meetup slides☆140Updated 4 years ago
- Simple Dynamic Batching Inference☆145Updated 2 years ago
- PyTorch source code reading, version 0.2.0☆88Updated 4 years ago
- oneflow documentation☆68Updated 4 months ago
- ☆123Updated 3 years ago
- OneFlow models for benchmarking.☆104Updated 3 months ago
- A small deep-learning framework with C++/Python/CUDA☆53Updated 6 years ago
- A simple deep learning framework that supports automatic differentiation and GPU acceleration.☆56Updated last year
- Performance of the C++ interface of flash attention and flash attention v2 in large language model (LLM) inference scenarios.☆29Updated 2 months ago
- My learning notes about AI, including Machine Learning and Deep Learning.☆18Updated 5 years ago
- Tutorial code on how to build your own Deep Learning System in 2k Lines☆126Updated 7 years ago
- ☆18Updated 3 years ago
- TensorFlow and TVM integration☆38Updated 4 years ago
- Transformer related optimization, including BERT, GPT☆60Updated last year
- Models and examples built with OneFlow☆96Updated last month
- implement bert in pure c++☆32Updated 4 years ago
- A super light-weight deep learning library based on NumPy in PyTorch fashion.☆93Updated 3 years ago
- DeepLearning Framework Performance Profiling Toolkit☆276Updated 2 years ago
- Inference framework for MoE layers based on TensorRT with Python binding☆41Updated 3 years ago
- Compiler Infrastructure for Neural Networks☆143Updated last year
- symmetric int8 gemm☆66Updated 4 years ago
- Dynamic Tensor Rematerialization prototype (modified PyTorch) and simulator. Paper: https://arxiv.org/abs/2006.09616☆129Updated last year
- Tutorials for writing high-performance GPU operators in AI frameworks.☆123Updated last year
- An unofficial cuda assembler, for all generations of SASS, hopefully :)☆78Updated last year
- Running BERT without Padding☆460Updated 2 years ago
- This is an implementation of sgemm_kernel on L1d cache.☆216Updated 8 months ago
- an automatic differentiation framework with dynamic graph support☆101Updated 4 years ago