SYCL implementation of Fused MLPs for Intel GPUs
☆51Nov 24, 2025Updated 5 months ago
Alternatives and similar repositories for tiny-dpcpp-nn
Users who are interested in tiny-dpcpp-nn are comparing it to the libraries listed below.
- Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU (XPU) devices. Note…☆65Jun 30, 2025Updated 10 months ago
- ☆61Dec 18, 2024Updated last year
- C++ pipeline with OpenVINO native API for Stable Diffusion v1.5☆13Feb 23, 2024Updated 2 years ago
- ☆24Apr 7, 2026Updated last month
- Ribbon Menu for React☆20Feb 23, 2024Updated 2 years ago
- Code for paper "Beyond Closure Models: Learning Chaotic Systems via Physics-Informed Neural Operators".☆16Dec 24, 2025Updated 4 months ago
- Benchmarks of different devices I have come across☆42Aug 28, 2025Updated 8 months ago
- CPU and GPU tutorial examples☆13Apr 4, 2025Updated last year
- ☆96May 10, 2026Updated last week
- ☆22Apr 15, 2026Updated last month
- Reader for CalculiX .dat files☆11May 12, 2025Updated last year
- Wave: Python Domain-Specific Language for High Performance Machine Learning☆55May 4, 2026Updated 2 weeks ago
- Super fast FP32 matrix multiplication on RDNA3☆90Mar 30, 2025Updated last year
- Active learning of extreme events using deep neural operators.☆16Nov 10, 2022Updated 3 years ago
- Zig regex experiment☆13Nov 6, 2025Updated 6 months ago
- Sub-module for OpenFOAM that provides a solver for embedding SmartSim and its external dependencies (i.e. SmartRedis) into OpenFOAM.☆44Sep 10, 2025Updated 8 months ago
- Official Implementation of "AIVT: Inference of turbulent thermal convection from measured 3D velocity data by physics-informed Kolmogorov…☆16Oct 10, 2025Updated 7 months ago
- JIT Compilation for Multiple Architectures: C++, OpenMP, CUDA, HIP, OpenCL, Metal☆13Aug 6, 2025Updated 9 months ago
- A scalable implementation of diffusion and flow-matching with XGBoost models, applied to calorimeter data.☆22Mar 23, 2026Updated last month
- SYCL* Templates for Linear Algebra (SYCL*TLA) - SYCL based CUTLASS implementation for Intel GPUs☆74Updated this week
- A portable implementation of SZ lossy compression for AMD GPUs and Hygon DCUs.☆10Feb 26, 2025Updated last year
- ☆18Aug 9, 2023Updated 2 years ago
- scalable data movement in Exascale Supercomputers☆19Mar 30, 2026Updated last month
- ROCm Documentation Python package for ReadTheDocs build standardization☆17Updated this week
- Profiling Tools Interfaces for GPU (PTI for GPU) is a set of Getting Started Documentation and Tools Library to start performance analysi…☆267May 13, 2026Updated last week
- Julian macros for wrapping ccall☆14May 27, 2021Updated 4 years ago
- ☆36May 1, 2026Updated 3 weeks ago
- ☆13Aug 22, 2025Updated 9 months ago
- A tool allowing students of Coursera's Heterogeneous Parallel Programming to work on homework using a machine without a CUDA GPU.☆11Mar 11, 2015Updated 11 years ago
- Ansible Role - Easy and flexible dotfile installation with stow.☆11Oct 6, 2023Updated 2 years ago
- The official repository of paper "ScaleLong: Towards More Stable Training of Diffusion Model via Scaling Network Long Skip Connection" (N…☆50Oct 23, 2023Updated 2 years ago
- Separable Physics-Informed DeepONets in JAX☆25Nov 29, 2024Updated last year
- Commands that will make you more comfortable with the ROCm toolkit.☆18Aug 1, 2024Updated last year
- ☆93Updated this week
- Proof of concept for type system with unions, intersections and complements.☆14Apr 21, 2023Updated 3 years ago
- Part of 5th place solution for Peking University/Baidu - Autonomous Driving on Kaggle (https://www.kaggle.com/c/pku-autonomous-driving).☆23Sep 11, 2020Updated 5 years ago
- A tool for generating information about the matrix multiplication instructions in AMD Radeon™ and AMD Instinct™ accelerators☆134Apr 10, 2026Updated last month
- ☆20Mar 27, 2023Updated 3 years ago
- The vLLM XPU kernels for Intel GPU☆44Updated this week