Libraries-Openly-Fused / cvGPUSpeedup

A faster implementation of OpenCV-CUDA that uses OpenCV objects, and more!

☆54

Alternatives and similar repositories for cvGPUSpeedup

Users who are interested in cvGPUSpeedup are comparing it to the libraries listed below.

triple-Mu / TensorRT2ONNX
A tool to convert a TensorRT engine/plan to a fake ONNX model
☆41 Updated 3 years ago
onnx / neural-compressor
Model compression for ONNX
☆99 Updated last year
triple-Mu / HunyuanDiT-TensorRT-libtorch
HunyuanDiT with TensorRT and libtorch
☆18 Updated last year
Libraries-Openly-Fused / FusedKernelLibrary
Implementation of a methodology that allows all sorts of user-defined GPU kernel fusion, for non-CUDA programmers.
☆37 Updated this week
NVlabs / EfficientDL
☆34 Updated 6 months ago
caibucai22 / awesome-cuda
Awesome code, projects, books, etc. related to CUDA
☆28 Updated 2 weeks ago
zhenhuaw-me / onnxcli
ONNX Command-Line Toolbox
☆35 Updated last year
meta-pytorch / tokenizers
C++ implementations for various tokenizers (sentencepiece, tiktoken, etc.).
☆44 Updated last week
dusty-nv / NanoDB
Zero-copy multimodal vector DB with CUDA and CLIP/SigLIP
☆64 Updated 7 months ago
leimao / Nsight-Systems-Docker-Image
Nsight Systems In Docker
☆20 Updated 2 years ago
megvii-research / IntLLaMA
IntLLaMA: A fast and light quantization solution for LLaMA
☆18 Updated 2 years ago
Oneflow-Inc / OneFlow-Pruning
[CVPR-2023] Towards Any Structural Pruning
☆17 Updated 2 years ago
Bruce-Lee-LY / decoding_attention
Decoding Attention is specially optimized for MHA, MQA, GQA and MLA using CUDA cores for the decoding stage of LLM inference.
☆46 Updated 6 months ago
MILVLG / mlc-imp
Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
☆12 Updated last year
jimmy-evo / opencl_kernels
An easy way to run, test, benchmark and tune OpenCL kernel files
☆24 Updated 2 years ago
TRT2022 / ControlNet_TensorRT
Tianchi NVIDIA TensorRT Hackathon 2023: third-place solution in the preliminary round of the generative AI model optimization contest
☆50 Updated 2 years ago
CVHub520 / efficientvit
EfficientViT is a new family of vision models for efficient high-resolution vision.
☆29 Updated 2 years ago
FeiGeChuanShu / trt2023
NVIDIA TensorRT Hackathon 2023 second-round topic: building and optimizing Tongyi Qianwen Qwen-7B with TensorRT-LLM
☆43 Updated 2 years ago
Oneflow-Inc / vision
Datasets, Transforms and Models specific to Computer Vision
☆90 Updated 2 years ago
inisis / OnnxLLM
Large Language Model Onnx Inference Framework
☆36 Updated last month
quic / efficient-transformers
This library empowers users to seamlessly port pretrained models and checkpoints on the HuggingFace (HF) hub (developed using HF transfor…
☆85 Updated this week
lucasjinreal / wanwu_release
Wanwu models release, code will be released soon
☆24 Updated 3 years ago
ahennequ / cuda-tensorcores-register-mapping
☆19 Updated 3 years ago
NVIDIA / nvImageCodec
The nvImageCodec library of GPU- and CPU-accelerated codecs featuring a unified interface
☆130 Updated 3 weeks ago
MegEngine / mgeconvert
Converter from MegEngine to other frameworks
☆69 Updated 2 years ago
latentCall145 / channels-last-groupnorm
A CUDA kernel for NHWC GroupNorm for PyTorch
☆22 Updated last year
AXERA-TECH / CLIP-ONNX-AX650-CPP
☆27 Updated 5 months ago
lucasjinreal / wnnx_models
Various test models in WNNX format. They can be viewed with `pip install wnetron && wnetron`
☆12 Updated 3 years ago
triple-Mu / Stable-Diffusion-TensorRT
Stable Diffusion in TensorRT 8.5+
☆15 Updated 2 years ago
Oneflow-Inc / oneflow-lite
☆18 Updated last year