CPU Memory Compiler and Parallel programming
☆26Nov 18, 2024Updated last year
Alternatives and similar repositories for riven
Users that are interested in riven are comparing it to the libraries listed below.
- flash attention tutorial written in python, triton, cuda, cutlass☆502Jan 20, 2026Updated 2 months ago
- a simple WIP runtime reflection library☆13May 11, 2022Updated 3 years ago
- ☆23Aug 14, 2024Updated last year
- ☆119May 16, 2025Updated 11 months ago
- Documentation for RELION☆14Mar 13, 2026Updated last month
- ☆120Apr 11, 2024Updated 2 years ago
- ☆18Jan 4, 2024Updated 2 years ago
- Flash Attention in ~100 lines of CUDA (forward pass only)☆12Jun 10, 2024Updated last year
- Quantize yolov7 using pytorch_quantization.🚀🚀🚀☆12Oct 20, 2023Updated 2 years ago
- ☆12Feb 7, 2018Updated 8 years ago
- Point cloud loading, display, and mouse/keyboard interaction based on QOpenGLWidget, with rotation, translation, and zoom of the point cloud☆11May 7, 2020Updated 5 years ago
- Kunpeng Tech Blog: https://kunpengcompute.github.io/☆19Jul 8, 2021Updated 4 years ago
- detailed notes for PointNet☆11Oct 23, 2020Updated 5 years ago
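- A good project for campus, autumn, and spring recruitment and internships: build from scratch an LLM inference framework supporting LLama2/3 and Qwen2.5.☆527Oct 28, 2025Updated 5 months ago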
- Google's MediaPipe (v0.8.9) and Python Wheel installer for Jetson Nano (JetPack 4.6) compiled for CUDA 10.2☆16Jun 7, 2023Updated 2 years ago
- ☆46Mar 4, 2026Updated last month
- Estimate depth from surface normal.☆12Aug 14, 2020Updated 5 years ago
- [ACL2025 Oral🔥]Turning Trash into Treasure: Accelerating Inference of Large Language Models with Token Recycling☆27Nov 11, 2025Updated 5 months ago
- An object tracking project with YOLOv5-v5.0 and Deepsort, speed up by C++ and TensorRT.☆16Oct 23, 2025Updated 5 months ago
- Using TensorRT accelerate Segformer.☆11Oct 6, 2023Updated 2 years ago
- CUDA 8-bit Tensor Core Matrix Multiplication based on m16n16k16 WMMA API☆36Sep 15, 2023Updated 2 years ago
- Introduction to learning CUDA programming☆38Jul 22, 2024Updated last year
- ☆22Aug 20, 2025Updated 7 months ago
- Deep insight tensorrt, including but not limited to qat, ptq, plugin, triton_inference, cuda☆23Updated this week
- CUTLASS and CuTe Examples☆134Nov 30, 2025Updated 4 months ago
- SPBench: A Framework for Benchmarking Stream Processing Applications☆11Dec 16, 2025Updated 4 months ago
- Rust bindings for SPDK☆12Mar 5, 2020Updated 6 years ago
- A tool to convert a TensorRT engine/plan to a fake ONNX☆41Nov 22, 2022Updated 3 years ago
- SGLang Kernel Wheel Index☆21Updated this week
- HWFI: Hybrid Warping Fusion for Video Frame Interpolation. IJCV 2022☆11Sep 7, 2022Updated 3 years ago
- This project is about convolution operator optimization on GPU, including GEMM-based (Implicit GEMM) convolution.☆43Sep 29, 2025Updated 6 months ago
- Performance Engineering of Software Systems (6.172)☆28Feb 27, 2020Updated 6 years ago
- MicroMix: Efficient Mixed-Precision Quantization with Microscaling Formats for Large Language Models☆28Apr 2, 2026Updated 2 weeks ago
- CUDA Templates for Linear Algebra Subroutines☆102Apr 25, 2024Updated last year
- Unofficial implementation of Chain of Hindsight (https://arxiv.org/abs/2302.02676) using pytorch and huggingface Trainers.☆11Apr 5, 2023Updated 3 years ago
- CUDA SGEMM optimization note☆15Oct 31, 2023Updated 2 years ago
- YOLOv5 Quantization Aware Training with TensorRT☆19Jan 10, 2023Updated 3 years ago
- PointPillars TensorRT version pretrained on MMDetection3d with WaymoOpenDataset☆23Aug 11, 2022Updated 3 years ago
- ☆45Apr 7, 2022Updated 4 years ago