SqueezeBits / owlite-examplesLinks

OwLite Examples repository offers illustrative example codes to help users seamlessly compress PyTorch deep learning models and transform them into TensorRT engines.

☆10

Alternatives and similar repositories for owlite-examples

Users that are interested in owlite-examples are comparing it to the libraries listed below

Sorting:

ConstantPark / DL_Compiler
Study Group of Deep Learning Compiler
☆163Updated 2 years ago
junstar92 / nvidia-libraries-study
☆54Updated 9 months ago
SqueezeBits / owlite
OwLite is a low-code AI model compression toolkit for AI models.
☆50Updated 3 months ago
xvyaward / owq
Code for the AAAI 2024 Oral paper "OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Model…
☆64Updated last year
sjquan / 2022-Study
☆56Updated 2 years ago
VIA-Research / vTrain
☆73Updated 3 months ago
naver-aics / lut-gemm
☆68Updated last year
SqueezeBits / QUICK
QUICK: Quantization-aware Interleaving and Conflict-free Kernel for efficient LLM inference
☆119Updated last year
fredrickang / LaLaRAND
LaLaRAND: Flexible Layer-by-Layer CPU/GPU Scheduling for Real-Time DNN Tasks
☆15Updated 3 years ago
etri / nest-compiler
NEST Compiler
☆117Updated 6 months ago
PyTorchKorea / pytorchcore-kr
PyTorch CoreSIG
☆56Updated 8 months ago
swsnu / aisys2023
☆103Updated 2 years ago
mlsys-seo / ooo-backprop
☆25Updated 2 years ago
HabanaAI / Habana_Custom_Kernel
Provides the examples to write and build Habana custom kernels using the HabanaTools
☆22Updated 4 months ago
SNU-ARC / any-precision-llm
[ICML 2024 Oral] Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs
☆113Updated last month
microsoft / microxcaling
PyTorch emulation library for Microscaling (MX)-compatible data formats
☆288Updated 2 months ago
ranggihwang / Pregated_MoE
☆51Updated last year
ROCm / tensorcast
☆13Updated 2 weeks ago
Qualcomm-AI-research / FP8-quantization
☆158Updated 2 years ago
efficient-ai-study / efficient-ai-study
☆90Updated last year
Qualcomm-AI-research / transformer-quantization
☆206Updated 3 years ago
VIA-Research / dpsgd_profiler
☆20Updated 8 months ago
snu-comparch / Tender
Tender: Accelerating Large Language Models via Tensor Decompostion and Runtime Requantization (ISCA'24)
☆20Updated last year
mrsnu / band
Multi-DNN Inference Engine for Heterogeneous Mobile Processors
☆34Updated last year
aojunzz / NM-sparsity
☆238Updated 2 years ago
rain-neuromorphics / torchmx
PyTorch Quantization Framework For OCP MX Datatypes.
☆13Updated 3 months ago
DD-DuDa / awesome-vit-quantization-acceleration
List of papers related to Vision Transformers quantization and hardware acceleration in recent AI conferences and journals.
☆94Updated last year
kssteven418 / I-BERT
[ICML'21 Oral] I-BERT: Integer-only BERT Quantization
☆255Updated 2 years ago
junstar92 / parallel_programming_study
Study parallel programming - CUDA, OpenMP, MPI, Pthread
☆58Updated 3 years ago
clevercool / ANT-Quantization
☆107Updated last year