CPU, Memory, Compiler and Parallel Programming
☆26Nov 18, 2024Updated last year
Alternatives and similar repositories for riven
Users that are interested in riven are comparing it to the libraries listed below.
- flash attention tutorial written in python, triton, cuda, cutlass☆494Jan 20, 2026Updated 2 months ago
- KsanaDiT: High-Performance DiT (Diffusion Transformer) Inference Framework for Video & Image Generation☆46Mar 6, 2026Updated 3 weeks ago
- a simple WIP runtime reflection library☆13May 11, 2022Updated 3 years ago
- ☆23Aug 14, 2024Updated last year
- ☆119May 16, 2025Updated 10 months ago
- YOLO for Uniform Directed Object detection☆13Mar 28, 2024Updated 2 years ago
- A Low-Overhead tool for Floating-Point Exception Detection in NVIDIA GPUs☆13Dec 17, 2024Updated last year
- ☆120Apr 11, 2024Updated last year
- ☆18Jan 4, 2024Updated 2 years ago
- (IJCAI 2023) Sph2Pob: Boosting Object Detection on Spherical Images with Planar Oriented Boxes Methods☆14Aug 23, 2023Updated 2 years ago
- Flash Attention in ~100 lines of CUDA (forward pass only)☆11Jun 10, 2024Updated last year
- Quantize yolov7 using pytorch_quantization.🚀🚀🚀☆12Oct 20, 2023Updated 2 years ago
- [ECCV 2024] FlexAttention for Efficient High-Resolution Vision-Language Models☆46Jan 8, 2025Updated last year
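- A good project for campus recruiting (autumn and spring hiring) and internships: build, hands-on and from scratch, an LLM inference framework that supports LLama2/3 and Qwen2.5.☆519Oct 28, 2025Updated 5 months ago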
- Some C++/C/CUDA Extension☆16Feb 2, 2022Updated 4 years ago
- ☆42Mar 4, 2026Updated 3 weeks ago
- [ACL2025 Oral🔥]Turning Trash into Treasure: Accelerating Inference of Large Language Models with Token Recycling☆27Nov 11, 2025Updated 4 months ago
- An object tracking project with YOLOv5-v5.0 and Deepsort, sped up with C++ and TensorRT.☆17Oct 23, 2025Updated 5 months ago
- Using TensorRT accelerate Segformer.☆11Oct 6, 2023Updated 2 years ago
- CUDA 8-bit Tensor Core Matrix Multiplication based on m16n16k16 WMMA API☆35Sep 15, 2023Updated 2 years ago
- [arXiv 2026] Official PyTorch Repository for "Coarse-Guided Visual Generation via Weighted h-Transform Sampling"☆42Mar 16, 2026Updated last week
- An introduction to learning CUDA programming☆38Jul 22, 2024Updated last year
- ☆20Aug 20, 2025Updated 7 months ago
- Deep insight tensorrt, including but not limited to qat, ptq, plugin, triton_inference, cuda☆23Mar 8, 2026Updated 3 weeks ago
- CUTLASS and CuTe Examples☆134Nov 30, 2025Updated 3 months ago
- Rust bindings for SPDK☆12Mar 5, 2020Updated 6 years ago
- ☆12Aug 31, 2023Updated 2 years ago
- A tool to convert a TensorRT engine/plan to a fake onnx☆41Nov 22, 2022Updated 3 years ago
- SGLang Kernel Wheel Index☆18Updated this week
- HWFI: Hybrid Warping Fusion for Video Frame Interpolation. IJCV 2022☆11Sep 7, 2022Updated 3 years ago
- This project is about convolution operator optimization on GPU, including GEMM-based (Implicit GEMM) convolution.☆43Sep 29, 2025Updated 6 months ago
- Performance Engineering of Software Systems (6.172)☆28Feb 27, 2020Updated 6 years ago
- MicroMix: Efficient Mixed-Precision Quantization with Microscaling Formats for Large Language Models☆29Feb 12, 2026Updated last month
- YOLOv5 Quantization Aware Training with TensorRT☆19Jan 10, 2023Updated 3 years ago
- PointPillars TensorRT version pretrained on MMDetection3d with WaymoOpenDataset☆22Aug 11, 2022Updated 3 years ago
- CUDA SGEMM optimization note☆15Oct 31, 2023Updated 2 years ago
- ☆45Apr 7, 2022Updated 3 years ago
- ☆12Mar 13, 2023Updated 3 years ago
- Multi-Level Triton Runner supporting Python, IR, PTX, and cubin.☆84Updated this week