Neural network from scratch in CUDA/C++
☆94Sep 8, 2025Updated 9 months ago
Alternatives and similar repositories for neural-network-cuda
Users that are interested in neural-network-cuda are comparing it to the libraries listed below.
- Implement Neural Networks in CUDA from Scratch☆23May 17, 2024Updated 2 years ago
- A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.☆600May 13, 2026Updated 3 weeks ago
- Simple neural network implementation using CUDA technology. It is an educational implementation.☆99Apr 12, 2018Updated 8 years ago
- Standalone command-line tool for compiling Triton kernels☆20Sep 13, 2024Updated last year
- Distributed Training of Bayesian Neural Networks at Scale☆11May 26, 2020Updated 6 years ago
- Nsight Compute In Docker☆13Dec 21, 2023Updated 2 years ago
- Play-with-compiler sandbox based on PWD☆10Oct 22, 2020Updated 5 years ago
- Code from the "CUDA Crash Course" YouTube series by CoffeeBeforeArch☆959Jul 19, 2023Updated 2 years ago
- Record GPU memory accesses of a CUDA program and visualize the access pattern in a browser☆13Nov 17, 2020Updated 5 years ago
- ☆11Jan 18, 2024Updated 2 years ago
- Goal: a website to automatically train and certify compiler researchers and developers☆10Nov 24, 2019Updated 6 years ago
- ☆10May 20, 2022Updated 4 years ago
- ☆10Jun 4, 2021Updated 5 years ago
- Scaling RLLib for generic simulation environments on Theta☆20Feb 16, 2023Updated 3 years ago
- A C++ implementation of an LSTM neural network, parallelized for the GPU using CUDA☆25Oct 29, 2017Updated 8 years ago
- Code and pretrained models accompanying the paper "Ensembling geophysical models using Bayesian Neural Networks"☆10Jul 11, 2022Updated 3 years ago
- 100 numerical computation exercises☆13Jan 27, 2020Updated 6 years ago
- Scalable Quantum Neural Network builds and trains a large-scale QNN in a modular fashion. SQNN is evaluated with a binary classification …☆12Oct 4, 2023Updated 2 years ago
- LLVM/MLIR based compiler instrumentation of AMD GPU kernels☆21Jul 13, 2025Updated 10 months ago
- Unsupervised Lifelong Person Re-identification via Contrastive Rehearsal☆11Apr 7, 2022Updated 4 years ago
- Rembg is a tool to remove image backgrounds.☆12Nov 29, 2022Updated 3 years ago
- Experiment of using Tangent to autodiff triton☆82Jan 22, 2024Updated 2 years ago
- A repo based on XiLin Li's PSGD repo that extends some of the experiments.☆14Oct 7, 2024Updated last year
- Multi-GPU (CUDA-MPI) baseline implementation of Heat Equation and the inviscid Burgers' equation☆12Oct 17, 2017Updated 8 years ago
- Tensor Basis Neural Network for Scalar Mixing☆10Mar 24, 2023Updated 3 years ago
- An LLVM pass for counting global uncoalesced accesses in CUDA code via dynamic analysis.☆14Nov 17, 2018Updated 7 years ago
- Notes and toy codes...☆11Jul 5, 2019Updated 6 years ago
- Implementation of the paper ''Implicit Feature Refinement for Instance Segmentation''.☆20Oct 27, 2021Updated 4 years ago
- A Highly-Extensible Data Assimilation Testing Suite☆10Feb 24, 2019Updated 7 years ago
- Stanford Cars dataset by classes folder☆21Nov 7, 2024Updated last year
- ☆113Mar 12, 2026Updated 2 months ago
- Molecular Dynamic Graph Neural Network☆20Aug 5, 2021Updated 4 years ago
- A translation and a rotation transform gizmo implemented in QML.☆18Apr 9, 2020Updated 6 years ago
- Pure Java Llama2 inference with optional multi-GPU CUDA implementation☆13Sep 2, 2023Updated 2 years ago
- lshash for python3☆10Mar 21, 2018Updated 8 years ago
- Quantized LLM training in pure CUDA/C++.☆246Jun 3, 2026Updated last week
- Exploiting autoencoders to explore and predict flow dynamics☆10Oct 4, 2019Updated 6 years ago
- The ALCF hosts a regular simulation, data, and learning workshop to help users scale their applications. This repository contains the exa…☆75Dec 17, 2025Updated 5 months ago
- Inference for Llama/Llama2/Llama3 models in NumPy☆20Nov 22, 2023Updated 2 years ago