Starlight: A Kernel Optimizer for GPU Processing
☆16 · Jan 10, 2024 · Updated 2 years ago
Alternatives and similar repositories for starlight
Users that are interested in starlight are comparing it to the libraries listed below.
- Image Registration on FPGAs ☆21 · Aug 28, 2022 · Updated 3 years ago
- ☆13 · Apr 15, 2025 · Updated 11 months ago
- A collection of Matplotlib and Seaborn recipes and utilities collected over years of colorful plot-making ☆22 · Nov 17, 2023 · Updated 2 years ago
- An OpenCL-based FPGA benchmark suite for HPC ☆37 · Jan 29, 2026 · Updated last month
- ☆15 · Nov 30, 2023 · Updated 2 years ago
- Tuning Assistant for Floating point to Fixed point Optimization ☆19 · Mar 26, 2022 · Updated 4 years ago
- ARIES: An Agile MLIR-Based Compilation Flow for Reconfigurable Devices with AI Engines (FPGA 2025 Best Paper Nominee) ☆60 · Mar 8, 2026 · Updated 2 weeks ago
- ☆38 · Jul 6, 2025 · Updated 8 months ago
- ☆24 · Dec 1, 2020 · Updated 5 years ago
- Argonne Leadership Computing Facility OpenCL tutorial ☆10 · Aug 22, 2025 · Updated 7 months ago
- ETHZ Heterogeneous Accelerated Compute Cluster. ☆38 · Oct 7, 2025 · Updated 5 months ago
- Hands-on experience programming AI Engines using Vitis Unified Software Platform ☆40 · Jul 24, 2024 · Updated last year
- [FPGA'21] Microbenchmarks for Demystifying the Memory System of Modern Datacenter FPGAs for Software Programmers ☆31 · Dec 16, 2021 · Updated 4 years ago
- ☆40 · Mar 26, 2020 · Updated 6 years ago
- Ariston Net integration with Home Assistant ☆10 · Nov 3, 2020 · Updated 5 years ago
- FPGA implementation of distributed union find algorithm ☆31 · Apr 15, 2025 · Updated 11 months ago
- FPGA version of Rodinia in HLS C/C++ ☆42 · Dec 24, 2020 · Updated 5 years ago
- AIM: Accelerating Arbitrary-precision Integer Multiplication on Heterogeneous Reconfigurable Computing Platform Versal ACAP (Full Paper a… ☆26 · May 18, 2025 · Updated 10 months ago
- ☆15 · Nov 30, 2023 · Updated 2 years ago
- PyTorch Code for the Paper: "Exploiting Uncertainty of Loss Landscape for Stochastic Optimization" [Bhaskara et al. (2019)] ☆16 · Dec 8, 2025 · Updated 3 months ago
- A Data Science pipeline for Algorithmic Trading: A comparative study in applications to Finance and cryptoeconomics ☆14 · Jul 1, 2022 · Updated 3 years ago
- Educational Verilog library that supports IEEE754 floating point arithmetic with a parametrizable mantissa and exponent ☆32 · Mar 13, 2025 · Updated last year
- Qwen3-0.6B megakernel: 527 tok/s decode on RTX 3090 (3.8x faster than PyTorch) ☆83 · Feb 10, 2026 · Updated last month
- Optimize GEMM with tensorcore step by step ☆37 · Dec 17, 2023 · Updated 2 years ago
- Hands-on HPC I/O tutorial material ☆18 · Oct 9, 2025 · Updated 5 months ago
- ☆21 · Jan 23, 2024 · Updated 2 years ago
- some mixture of experts architecture implementations ☆26 · Mar 22, 2024 · Updated 2 years ago
- This is the official code for CoRL 2022 "Robustness Certification of Visual Perception Models via Camera Motion Smoothing" ☆11 · Apr 5, 2023 · Updated 2 years ago
- Official code of MoSA (Mixture of Sparse Adapters). ☆13 · Dec 14, 2023 · Updated 2 years ago
- Runs a single CUDA/OpenCL kernel, taking its source from a file and arguments from the command-line ☆24 · Updated this week
- TileGraph is an experimental DNN compiler that utilizes static code generation and kernel fusion techniques. ☆11 · Sep 18, 2024 · Updated last year
- ☆38 · Mar 14, 2024 · Updated 2 years ago
- Eye-MMS: Miniature multi-scale segmentation network of key eye-regions in embedded applications ☆12 · Jul 4, 2022 · Updated 3 years ago
- CacheFlow is a Linux kernel module that exposes the contents of the last-level cache on *most* ARM machines. ☆17 · Jun 19, 2024 · Updated last year
- Graphical user interface for tensor networks ☆12 · Jul 27, 2020 · Updated 5 years ago
- A collection of examples of continuous analytics. ☆15 · Sep 27, 2022 · Updated 3 years ago
- Single-cell analysis methods in Rust ☆30 · Nov 4, 2025 · Updated 4 months ago
- train with kittens! ☆64 · Oct 25, 2024 · Updated last year
- A TensorFlow implementation of YOLOv4 (CSPDarknet53, PAN, SPP, CIoU, Mish) ☆13 · Sep 11, 2020 · Updated 5 years ago