CPU Memory Compiler and Parallel programming
☆26Nov 18, 2024Updated last year
Alternatives and similar repositories for riven
Users that are interested in riven are comparing it to the libraries listed below.
- Performance of the C++ interface of flash attention and flash attention v2 in large language model (LLM) inference scenarios.☆16Aug 31, 2023Updated 2 years ago
- ICML 2017 MEC: Memory-efficient Convolution for Deep Neural Network, C++ implementation (unofficial)☆17Apr 9, 2019Updated 7 years ago
- KsanaDiT: High-Performance DiT (Diffusion Transformer) Inference Framework for Video & Image Generation☆53May 13, 2026Updated 2 weeks ago
- ☆23Aug 14, 2024Updated last year
- Rust study notes☆11Jun 7, 2023Updated 2 years ago
- code repo for paper accepted in ICML 2023☆14Oct 19, 2023Updated 2 years ago
- ☆121Apr 11, 2024Updated 2 years ago
- A Low-Overhead tool for Floating-Point Exception Detection in NVIDIA GPUs☆13Dec 17, 2024Updated last year
- DanesfieldApp is web based application, for Danesfield Applications running at the back-end. Using for 3D Reconstruction from satellite i…☆12Oct 28, 2023Updated 2 years ago
- Flash Attention in ~100 lines of CUDA (forward pass only)☆12Jun 10, 2024Updated last year
- (IJCAI 2023) Sph2Pob: Boosting Object Detection on Spherical Images with Planar Oriented Boxes Methods☆14Aug 23, 2023Updated 2 years ago
- ☆12Feb 7, 2018Updated 8 years ago
- ☆15Oct 9, 2022Updated 3 years ago
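- Based on QOpenGLWidget; implements point cloud loading, display, and mouse/keyboard interaction, including point cloud rotation, translation, and zooming.☆11May 7, 2020Updated 6 years ago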
- [ECCV 2024] FlexAttention for Efficient High-Resolution Vision-Language Models☆47Jan 8, 2025Updated last year
- detailed notes for PointNet☆11Oct 23, 2020Updated 5 years ago
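- A good project for campus recruitment (autumn and spring hiring) and internships: build, hands-on and from scratch, an LLM inference framework that supports LLama2/3 and Qwen2.5.☆546Oct 28, 2025Updated 7 months ago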
- Some C++/C/CUDA Extension☆16Feb 2, 2022Updated 4 years ago
- Google's MediaPipe (v0.8.9) and Python Wheel installer for Jetson Nano (JetPack 4.6) compiled for CUDA 10.2☆16Jun 7, 2023Updated 2 years ago
- ☆50Mar 4, 2026Updated 2 months ago
- Estimate depth from surface normal.☆12Aug 14, 2020Updated 5 years ago
- An object tracking project with YOLOv5-v5.0 and Deepsort, speed up by C++ and TensorRT.☆16Oct 23, 2025Updated 7 months ago
- Using TensorRT accelerate Segformer.☆11Oct 6, 2023Updated 2 years ago
- Calibration of depth sensors, e.g. Kinect, Asus Xtion☆13Apr 26, 2019Updated 7 years ago
- Introduction to learning CUDA programming☆38Jul 22, 2024Updated last year
- ☆23Aug 20, 2025Updated 9 months ago
- Deep insight tensorrt, including but not limited to qat, ptq, plugin, triton_inference, cuda☆23May 10, 2026Updated 2 weeks ago
- CUTLASS and CuTe Examples☆135Nov 30, 2025Updated 5 months ago
- Rust bindings for SPDK☆12Mar 5, 2020Updated 6 years ago
- ☆12Aug 31, 2023Updated 2 years ago
- A tool convert TensorRT engine/plan to a fake onnx☆41Nov 22, 2022Updated 3 years ago
- [arXiv 2026] Official PyTorch Repository for "Coarse-Guided Visual Generation via Weighted h-Transform Sampling"☆42May 8, 2026Updated 3 weeks ago
- SGLang Kernel Wheel Index☆22May 22, 2026Updated last week
- HWFI: Hybrid Warping Fusion for Video Frame Interpolation. IJCV 2022☆11Sep 7, 2022Updated 3 years ago
- This project is about convolution operator optimization on GPU, include GEMM based (Implicit GEMM) convolution.☆43Sep 29, 2025Updated 8 months ago
- ☆21Jun 9, 2025Updated 11 months ago
- MicroMix: Efficient Mixed-Precision Quantization with Microscaling Formats for Large Language Models☆28Apr 2, 2026Updated last month
- CUDA Templates for Linear Algebra Subroutines☆101Apr 25, 2024Updated 2 years ago
- A CircuitPython RPN Calculator☆12Jul 22, 2025Updated 10 months ago