Pegessi/conv2d_direct

☆36

Alternatives and similar repositories for conv2d_direct

Users who are interested in conv2d_direct are comparing it to the libraries listed below.

latentCall145 / channels-last-groupnorm
A CUDA kernel for NHWC GroupNorm for PyTorch
☆23 · Nov 15, 2024 · Updated last year
nDIRECT / nDIRECT
A direct convolution library targeting ARM multi-core CPUs.
☆12 · Nov 27, 2024 · Updated last year
Manojbhat09 / nanoVLA
Minimal Vision Language Action framework for robot control systems
☆17 · Sep 15, 2025 · Updated 10 months ago
EnigmaHuang / Saad_Book_ForTran
Some "Formula Translations" for Yousef Saad's book "Iterative Methods for Sparse Linear Systems (2nd Edition)"
☆13 · Jan 14, 2018 · Updated 8 years ago
yhwang-hub / OrinMLLM
This project is primarily used to deploy large language models and multimodal large models on Orin. 🚀🚀🚀
☆18 · Jun 23, 2026 · Updated last month
ranran0523 / SPECNN
Code repo for paper accepted in ICML 2023
☆13 · Oct 19, 2023 · Updated 2 years ago
MegEngine / examples
A set of examples around MegEngine
☆31 · Dec 8, 2023 · Updated 2 years ago
NickCao / cumu
☆10 · Sep 23, 2023 · Updated 2 years ago
sfilippone / mld2p4-2
☆14 · Jul 16, 2020 · Updated 6 years ago
chemeng / GPGPU-GMRES-Method
CUDA GPU implementation of GMRES iterative Solver
☆10 · Apr 16, 2012 · Updated 14 years ago
njuhope / cuda_sgemm
☆121 · Apr 11, 2024 · Updated 2 years ago
gevtushenko / block_matrix_format_performance
☆12 · Jan 19, 2020 · Updated 6 years ago
melonedo / algebraic-layouts
☆23 · Aug 20, 2025 · Updated 11 months ago
kurenaif / auto_wmake
OpenFOAM right wmake at the right time
☆11 · Mar 10, 2019 · Updated 7 years ago
takahiro-hirofuchi / mesmeric-emulator
MESMERIC: A Software-based NVM Emulator Supporting Read/Write Asymmetric Latencies
☆10 · Oct 1, 2020 · Updated 5 years ago
daajoe / GPUSAT
☆12 · Sep 29, 2021 · Updated 4 years ago
YuxueYang1204 / CudaDemo
Implement custom operators in PyTorch with CUDA/C++
☆77 · Jan 1, 2023 · Updated 3 years ago
0xSero / deepseek-v4-flash-sm120
☆32 · Apr 26, 2026 · Updated 3 months ago
gouarin / GenEO
☆10 · Jan 13, 2023 · Updated 3 years ago
Lucieno / gforce-public
A crypto-assisted framework for protecting the privacy of models and queries in inference.
☆19 · Oct 28, 2021 · Updated 4 years ago
shuzhangzhong / HybriMoE-Preview
☆17 · Apr 9, 2025 · Updated last year
RohanNagar / parallel-logic-networks
Gate-Level Simulation on a GPU
☆10 · Nov 22, 2016 · Updated 9 years ago
ZHEQIUSHUI / SAM-ONNX-AX650-CPP
SAM and LaMa inpainting with a Qt GUI: place points or draw boxes interactively to run SAM with results shown in real time, then run inpainting. See the video in the README for details.
☆54 · Jan 30, 2024 · Updated 2 years ago
arirepo / paraGMRES
Massively Scalable Parallel GMRES C-code for Sparse System of Equations
☆13 · Feb 16, 2016 · Updated 10 years ago
SeungHwi0613 / ros2_bevfusion_demo
☆32 · Aug 25, 2023 · Updated 2 years ago
MadaoFY / yolov5_TensorRT_inference
TensorRT quantization and inference code for YOLOv5, tested to run on the Jetson platform
☆20 · May 11, 2023 · Updated 3 years ago
YangLinzhuo / cuda-sgemm-optimization
CUDA SGEMM optimization note
☆15 · Oct 31, 2023 · Updated 2 years ago
ZJLi2013 / awesome-kernel-skills
☆88 · Mar 31, 2026 · Updated 3 months ago
mrzhuzhe / riven
CPU Memory Compiler and Parallel programming
☆26 · Nov 18, 2024 · Updated last year
ahennequ / cuda-tensorcores-register-mapping
☆19 · Oct 3, 2022 · Updated 3 years ago
guoliefeng / lightning-lm_ROS
lightning-lm for ROS1 Noetic, a holdover from the old era
☆27 · Updated this week
toufique-morshed / CPU-GPU-TFHE
A CPU and GPU accelerated framework for TFHE. The framework includes algebraic, vector, and matrix operations.
☆21 · Apr 15, 2020 · Updated 6 years ago
c3sr / tcu_scope
☆50 · Jun 27, 2019 · Updated 7 years ago
b-carter / SufficientInputSubsets
Code for Sufficient Input Subsets Paper
☆14 · Mar 8, 2019 · Updated 7 years ago
yulonghui / MOCA
Official implementation of "Continual Learning by Modeling Intra-Class Variation" (MOCA). [TMLR 2023]
☆16 · Mar 3, 2023 · Updated 3 years ago
jotaviobiondo / llvm-register-allocator
A graph coloring register allocator for LLVM.
☆11 · Jan 23, 2017 · Updated 9 years ago
YusukeNagasaka / Batched-SpMM
New batched algorithm for sparse matrix-matrix multiplication (SpMM)
☆16 · May 7, 2019 · Updated 7 years ago
Mark-ThinkPad / TCP_Robot
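Computer networking course project: a simple chatbot built on TCP, written in Python 3. Early versions ran only in the terminal (CLI); the final version adds a "rudimentary" graphical interface for the client, implemented with Qt5 (PyQt5).
☆10 · Jun 17, 2019 · Updated 7 years ago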
MIoTLab / Acclaim
Acclaim: Adaptive Memory Reclaim to Improve User Experience in Android Systems [ATC '20]
☆16 · Aug 1, 2020 · Updated 5 years ago