morousg / cvGPUSpeedup
A faster implementation of OpenCV-CUDA that uses OpenCV objects, and more!
☆48Updated this week
Alternatives and similar repositories for cvGPUSpeedup:
Users who are interested in cvGPUSpeedup compare it to the libraries listed below.
- A tool to convert a TensorRT engine/plan to a fake onnx☆37Updated 2 years ago
- Model compression for ONNX☆84Updated 2 months ago
- HunyuanDiT with TensorRT and libtorch☆17Updated 8 months ago
- CLIP and SigLIP models optimized with TensorRT with a Transformers-like API☆21Updated 4 months ago
- Zero-copy multimodal vector DB with CUDA and CLIP/SigLIP☆46Updated 8 months ago
- ☆31Updated 7 months ago
- ☆18Updated 2 years ago
- Python scripts performing optical flow estimation using the NeuFlowV2 model in ONNX.☆40Updated 5 months ago
- ☆21Updated last month
- EfficientViT is a new family of vision models for efficient high-resolution vision.☆24Updated last year
- IntLLaMA: A fast and light quantization solution for LLaMA☆18Updated last year
- Stable Diffusion in TensorRT 8.5+☆14Updated last year
- [CVPR-2023] Towards Any Structural Pruning☆16Updated last year
- A CUDA kernel for NHWC GroupNorm for PyTorch☆16Updated 3 months ago
- Open deep learning compiler stack for cpu, gpu and specialized accelerators☆18Updated last week
- Tianchi NVIDIA TensorRT Hackathon 2023: Generative AI Model Optimization Competition, third-place solution in the preliminary round☆48Updated last year
- ☆15Updated 10 months ago
- Nsight Systems In Docker☆20Updated last year
- Implement Learning Efficient Convolutional Networks Through Network Slimming on YOLOX☆26Updated 2 years ago
- 📚FFPA: Yet another Faster Flash Prefill Attention with O(1)⚡️SRAM complexity for headdim > 256, 1.8x~3x↑🎉faster than SDPA EA.☆96Updated this week
- Simple tool for partial optimization of ONNX. Further optimize some models that cannot be optimized with onnx-optimizer and onnxsim by se…☆19Updated 9 months ago
- A Toolkit to Help Optimize Onnx Model☆113Updated 2 weeks ago
- FlexAttention w/ FlashAttention3 Support☆26Updated 4 months ago
- Instance and panoptic segmentation using yolov9 in onnx☆11Updated 10 months ago
- An easy way to run, test, benchmark and tune OpenCL kernel files☆23Updated last year
- TensorRT Acceleration for PyTorch Native Eager Mode Quantization Models☆14Updated 6 months ago
- NVIDIA TensorRT Hackathon 2023 second-round topic: building and optimizing the Tongyi Qianwen Qwen-7B model with TensorRT-LLM☆41Updated last year
- The nvImageCodec library of GPU- and CPU-accelerated codecs featuring a unified interface☆91Updated last month