Oneflow-Inc / OneFlow-Pruning
[CVPR-2023] Towards Any Structural Pruning
☆16Updated last year
Alternatives and similar repositories for OneFlow-Pruning:
Users who are interested in OneFlow-Pruning are comparing it to the libraries listed below.
- IntLLaMA: A fast and light quantization solution for LLaMA☆18Updated last year
- TVMScript kernel for deformable attention☆24Updated 3 years ago
- OneFlow->ONNX☆42Updated last year
- A set of examples around MegEngine☆31Updated last year
- An object detection codebase based on MegEngine.☆28Updated 2 years ago
- A toolkit for developers to simplify the transformation of nn.Module instances. It's now corresponding to Pytorch.fx.☆13Updated last year
- Translates networks from different platforms to an Intermediate Representation (IR)☆44Updated 6 years ago
- Yet another Polyhedra Compiler for DeepLearning☆19Updated last year
- Slides with modifications for a course at Tsinghua University.☆58Updated 2 years ago
- A tool to convert a TensorRT engine/plan to a fake ONNX model☆38Updated 2 years ago
- Official implementation of the EMNLP23 paper: Outlier Suppression+: Accurate quantization of large language models by equivalent and opti…☆47Updated last year
- OneFlow Serving☆20Updated 2 months ago
- Converter from MegEngine to other frameworks☆69Updated last year
- NVIDIA TensorRT Hackathon 2023 second-round topic: building and optimizing the Tongyi Qianwen Qwen-7B model with TensorRT-LLM☆41Updated last year
- Quantization-aware training package for NCNN on PyTorch☆70Updated 3 years ago
- [TMLR] Official PyTorch implementation of paper "Efficient Quantization-aware Training with Adaptive Coreset Selection"☆30Updated 6 months ago
- Tianchi NVIDIA TensorRT Hackathon 2023, Generative AI Model Optimization Contest: third-place solution from the preliminary round☆48Updated last year
- A codebase & model zoo for pretrained backbone based on MegEngine.☆33Updated last year
- Implementation of IceFormer: Accelerated Inference with Long-Sequence Transformers on CPUs (ICLR 2024).☆22Updated 8 months ago
- Various test models in WNNX format. They can be viewed with `pip install wnetron && wnetron`☆12Updated 2 years ago
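- Demo of autoTVM search for optimized neural-network inference code: compiles the open-source model centerface with TVM, uses autoTVM to search for the best inference code, and finally deploys it compiled as C++ code. The demo platform is CUDA, but other platforms such as Raspberry Pi, Android phones, and iPhones also work. This is a demonstration of …☆27Updated 3 years ago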
- Offline Quantization Tools for Deploy.☆123Updated last year
- [ICML 2022] "DepthShrinker: A New Compression Paradigm Towards Boosting Real-Hardware Efficiency of Compact Neural Networks", by Yonggan …☆71Updated 2 years ago