Obsolete version of CUDA-mode repo -- use cuda-mode/lectures instead
☆27Feb 8, 2024Updated 2 years ago
Alternatives and similar repositories for lecture2
Users that are interested in lecture2 are comparing it to the libraries listed below
- TileGraph is an experimental DNN compiler that utilizes static code generation and kernel fusion techniques.☆11Sep 18, 2024Updated last year
- ☆20May 28, 2024Updated last year
- Sample that converts the MobileSAM encoder/decoder to ONNX and runs inference☆11Apr 11, 2024Updated last year
- Extract streaming data from text using prefix completion.☆10Oct 6, 2024Updated last year
- Mixture of Expert (MoE) techniques for enhancing LLM performance through expert-driven prompt mapping and adapter combinations.☆12Feb 11, 2024Updated 2 years ago
- Resources for deep learning with satellite & aerial imagery☆14Sep 29, 2021Updated 4 years ago
- ☆12Nov 5, 2024Updated last year
- End-to-end accelerated inference and INT8 quantization for YOLOv12 with TensorRT☆12Mar 5, 2025Updated last year
- Converts CLIP models to ONNX☆10Jan 17, 2023Updated 3 years ago
- open source version of Umbra☆17Aug 11, 2023Updated 2 years ago
- Distributed Online Service Coordination Using Deep Reinforcement Learning☆19Sep 4, 2023Updated 2 years ago
- Deploys LDC, a lightweight dense convolutional neural network for edge detection, with ONNXRuntime; includes both C++ and Python versions☆10Apr 24, 2023Updated 2 years ago
- SIMD accelerated glm alternative for C/C++☆16Jan 3, 2023Updated 3 years ago
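- DETR tensor: removes the auxiliary heads that are unused during inference, deploys in FP16 for a further speedup, and offers a new fix for all-zero outputs after conversion to TensorRT.☆10Jan 9, 2024Updated 2 years ago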
- Inference deployment of Llama 3☆10Apr 21, 2024Updated last year
- No code solution for training tabular models☆35Jan 25, 2026Updated last month
- ffmpeg+cuvid+tensorrt+multicamera☆11Dec 31, 2024Updated last year
- 🎉My Collections of CUDA Kernels~☆10Jun 25, 2024Updated last year
- Python scripts that perform open-vocabulary object detection using the YOLO-World model in ONNX, and export the ONNX model for AXera's NPU☆11Aug 11, 2025Updated 7 months ago
- This repository provides a tutorial on running a sample publisher and subscriber using multiple transports of point_cloud_trans…☆10Updated this week
- lightNet (Object Detection and Semantic Segmentation) for ONNX and TensorRT☆15Jul 4, 2023Updated 2 years ago
- ☆20Aug 8, 2024Updated last year
- C++ code for deploying FastSAM on RKNN☆13May 30, 2024Updated last year
- BPE tokenization implemented in Golang 💙☆11Oct 2, 2023Updated 2 years ago
- YoloV8 segmentation NPU for the RK 3566/68/88☆17Apr 30, 2024Updated last year
- Refactored NeRF code that is easier to read☆13Mar 26, 2023Updated 2 years ago
- ☆176Feb 3, 2024Updated 2 years ago
- Deployment version of CenterNet3D, easy to port to different platforms (ONNX, TensorRT, RKNN, Horizon).☆12May 24, 2024Updated last year
- ☆11Dec 16, 2021Updated 4 years ago
- Multivariate Time Series Data usable for Time Series Segmentation and Time Series Classification. Each sample represents the multi-phased…☆11Apr 20, 2024Updated last year
- Example for Logging LLM Evaluator Prompt Responses☆18Aug 14, 2023Updated 2 years ago
- A Data Source for Reasoning Embodied Agents☆19Sep 18, 2023Updated 2 years ago
- A CUDA runtime environment built on the CUDA Driver API☆15Jul 30, 2025Updated 7 months ago
- Material for gpu-mode lectures☆5,865Feb 1, 2026Updated last month
- Multiple Lidar preprocessor for BEVfusion☆10Aug 25, 2023Updated 2 years ago
- Recording models☆11Sep 19, 2023Updated 2 years ago
- High-performance RMSNorm implementation using SM core storage (registers and shared memory)☆29Jan 22, 2026Updated 2 months ago
- Comparison of LLM API performance metrics: an in-depth analysis of key metrics such as TTFT and TPS☆19Sep 12, 2024Updated last year
- A simple interface for controlling Mac trackpad haptic feedback from Rust.☆24Jun 1, 2024Updated last year