Libraries-Openly-Fused / cvGPUSpeedup
A faster implementation of OpenCV-CUDA that uses OpenCV objects, and more!
☆54Updated 2 months ago
Alternatives and similar repositories for cvGPUSpeedup
Users that are interested in cvGPUSpeedup are comparing it to the libraries listed below
- Model compression for ONNX☆98Updated last year
- A tool to convert a TensorRT engine/plan to a fake ONNX☆42Updated 3 years ago
- HunyuanDiT with TensorRT and libtorch☆18Updated last year
- Zero-copy multimodal vector DB with CUDA and CLIP/SigLIP☆65Updated 9 months ago
- Awesome code, projects, books, etc. related to CUDA☆30Updated last week
- ☆34Updated 7 months ago
- C++ implementations for various tokenizers (sentencepiece, tiktoken etc).☆48Updated this week
- EfficientViT is a new family of vision models for efficient high-resolution vision.☆30Updated 2 years ago
- Wanwu models release, code will be released soon☆24Updated 3 years ago
- IntLLaMA: A fast and light quantization solution for LLaMA☆18Updated 2 years ago
- We aim to redefine Data Parallel libraries' portability, performance, programmability and maintainability, by using C++ standard features, i…☆47Updated this week
- [CVPR-2023] Towards Any Structural Pruning☆17Updated 2 years ago
- NVIDIA TensorRT Hackathon 2023 second-round topic: building and optimizing the Tongyi Qianwen Qwen-7B model with TensorRT-LLM☆43Updated 2 years ago
- ONNX Command-Line Toolbox☆35Updated last year
- A nvImageCodec library of GPU- and CPU- accelerated codecs featuring a unified interface☆139Updated last month
- A CUDA kernel for NHWC GroupNorm for PyTorch☆22Updated last year
- A Toolkit to Help Optimize Large Onnx Model☆163Updated 3 months ago
- ☆28Updated 7 months ago
- Decoding Attention is specially optimized for MHA, MQA, GQA and MLA using CUDA core for the decoding stage of LLM inference.☆46Updated 7 months ago
- CLIP and SigLIP models optimized with TensorRT with a Transformers-like API☆30Updated last year
- An easy way to run, test, benchmark and tune OpenCL kernel files☆24Updated 2 years ago
- Snapdragon Neural Processing Engine (SNPE) SDK. The Snapdragon Neural Processing Engine (SNPE) is a Qualcomm Snapdragon software accelerate…☆37Updated 3 years ago
- New operators for the ReferenceEvaluator, new kernels for onnxruntime, CPU, CUDA☆35Updated 3 weeks ago
- Large Language Model Onnx Inference Framework☆36Updated 2 months ago
- Tianchi NVIDIA TensorRT Hackathon 2023, Generative AI Model Optimization Competition: third-place solution in the preliminary round☆50Updated 2 years ago
- Nsight Systems In Docker☆21Updated 2 years ago
- [WIP] Better (FP8) attention for Hopper☆32Updated 11 months ago
- ☆18Updated 2 years ago
- Stable Diffusion in TensorRT 8.5+☆15Updated 2 years ago
- Generalist YOLO: Towards Real-Time End-to-End Multi-Task Visual Language Models☆87Updated 9 months ago