parrotsky / AutoDiCE

Distributed CNN inference at the edge; extends ncnn with CUDA and MPI+OpenMP support.

☆19

Alternatives and similar repositories for AutoDiCE

Users who are interested in AutoDiCE are comparing it to the libraries listed below.

Kyrie-Zhao / awesome-real-time-AI
This is a list of awesome edgeAI inference related papers.
☆96Updated last year
csu-eis / CoDL
☆77Updated 2 years ago
qipengwang / Melon
MobiSys#114
☆21Updated last year
Roxbili / TorchQuanter
Quantize PyTorch models; supports post-training quantization and quantization-aware training methods
☆14Updated 2 years ago
UoS-EEC / DynamicOFA
[CVPRW 2021] Dynamic-OFA: Runtime DNN Architecture Switching for Performance Scaling on Heterogeneous Embedded Platforms
☆29Updated 2 years ago
xxxxyu / FlexNN
Code for ACM MobiCom 2024 paper "FlexNN: Efficient and Adaptive DNN Inference on Memory-Constrained Edge Devices"
☆55Updated 5 months ago
zhutmost / neuralzip
An out-of-the-box PyTorch scaffold for neural network Quantization-Aware-Training (QAT) research. Website: https://github.com/zhutmost/neuralz…
☆26Updated 2 years ago
usc-isi / PipeEdge
PipeEdge: Pipeline Parallelism for Large-Scale Model Inference on Heterogeneous Edge Devices
☆35Updated last year
nycu-caslab / TinyTS
This is the open-source version of TinyTS. The code is dirty so far. We may clean the code in the future.
☆17Updated last year
caoting-dotcom / multiBranchModel
Multi-branch model for concurrent execution
☆17Updated 2 years ago
wangmaolin / niti
Implementation of "NITI: Training Integer Neural Networks Using Integer-only Arithmetic" on arXiv
☆84Updated 2 years ago
1hunters / LIMPQ
Official implementation for ECCV 2022 paper LIMPQ - "Mixed-Precision Neural Network Quantization via Learned Layer-wise Importance"
☆56Updated 2 years ago
zoranzhao / DeepThings
A Portable C Library for Distributed CNN Inference on IoT Edge Clusters
☆82Updated 5 years ago
ztt-21 / zTT
zTT: Learning-based DVFS with Zero Thermal Throttling for Mobile Devices [MobiSys'21] - Artifact Evaluation
☆25Updated 4 years ago
lixiuhong / batched_gemm
☆39Updated 5 years ago
Soroosh129 / NeuOS
Source code for the paper: "A Latency-Predictable Multi-Dimensional Optimization Framework for DNN-driven Autonomous Systems"
☆22Updated 4 years ago
microsoft / nn-Meter
A DNN inference latency prediction toolkit for accurately modeling and predicting the latency on diverse edge devices.
☆355Updated 11 months ago
fangvv / awesome-edge-intelligence-collections
About DNN compression and acceleration on Edge Devices.
☆55Updated 4 years ago
PannenetsF / TQT
PyTorch implementation of TQT.
☆21Updated 3 years ago
BoyuanFeng / APNN-TC
☆19Updated 3 years ago
xudoong / EdgeVisionTransformer
To deploy Transformer models in CV to mobile devices.
☆18Updated 3 years ago
cap-lab / S3NAS
Fast NPU-aware Neural Architecture Search
☆22Updated 3 years ago
yeshaokai / ADMM-NN
☆36Updated 6 years ago
UDC-GAC / openCNN
A Winograd Minimal Filter Implementation in CUDA
☆25Updated 3 years ago
IntelLabs / FP8-Emulation-Toolkit
PyTorch extension for emulating FP8 data formats on standard FP32 Xeon/GPU hardware.
☆110Updated 7 months ago
EEESlab / CMix-NN
CMix-NN: Mixed Low-Precision CNN Library for Memory-Constrained Edge Devices
☆44Updated 5 years ago
Qualcomm-AI-research / pruning-vs-quantization
☆22Updated last year
harvard-acc / EdgeBERT
HW/SW co-design of sentence-level energy optimizations for latency-aware multi-task NLP inference
☆49Updated last year
chhzh123 / ptc-tutorial
PyTorch compilation tutorial covering TorchScript, torch.fx, and Slapo
☆18Updated 2 years ago
stepbuystep / LightNAS
You Only Search Once: On Lightweight Differentiable Architecture Search for Resource-Constrained Embedded Platforms
☆11Updated 2 years ago