Starlight: A Kernel Optimizer for GPU Processing
☆16 Jan 10, 2024 Updated 2 years ago
Alternatives and similar repositories for starlight
Users that are interested in starlight are comparing it to the libraries listed below.
- Template Repository for Xilinx HLS design flow ☆12 Nov 18, 2021 Updated 4 years ago
- Polyglot CUDA integration for the GraalVM ☆18 Apr 6, 2025 Updated last year
- LOGAN: High-Performance Multi-GPU X-Drop Long-Read Alignment. ☆30 Sep 23, 2022 Updated 3 years ago
- OpenMP front-end based on LLVM for CGRAs ☆10 Oct 2, 2022 Updated 3 years ago
- An OpenCL-based FPGA benchmark suite for HPC ☆37 Jan 29, 2026 Updated 2 months ago
- ARIES: An Agile MLIR-Based Compilation Flow for Reconfigurable Devices with AI Engines (FPGA 2025 Best Paper Nominee) ☆62 Mar 8, 2026 Updated last month
- ☆41 Mar 29, 2026 Updated 2 weeks ago
- An awesome curated list of languages and tools to program FPGAs ☆73 Jun 22, 2022 Updated 3 years ago
- ☆24 Dec 1, 2020 Updated 5 years ago
- Argonne Leadership Computing Facility OpenCL tutorial ☆10 Aug 22, 2025 Updated 7 months ago
- A Scalable BFS Accelerator on FPGA-HBM Platform ☆15 Feb 22, 2024 Updated 2 years ago
- [FPGA'21] Microbenchmarks for Demystifying the Memory System of Modern Datacenter FPGAs for Software Programmers ☆31 Dec 16, 2021 Updated 4 years ago
- ☆39 Mar 26, 2020 Updated 6 years ago
- Ariston Net integration with home assistant ☆10 Nov 3, 2020 Updated 5 years ago
- ☆12 Mar 16, 2022 Updated 4 years ago
- ☆25 Jan 29, 2026 Updated 2 months ago
- PYNQ bindings for C and C++ to avoid requiring Python or Vitis to execute hardware acceleration. ☆31 Apr 9, 2026 Updated last week
- FPGA implementation of distributed union find algorithm ☆31 Apr 15, 2025 Updated last year
- FPGA version of Rodinia in HLS C/C++ ☆42 Dec 24, 2020 Updated 5 years ago
- PyTorch Code for the Paper: "Exploiting Uncertainty of Loss Landscape for Stochastic Optimization" [Bhaskara et al. (2019)] ☆16 Dec 8, 2025 Updated 4 months ago
- A Data Science pipeline for Algorithmic Trading: A comparative study in applications to Finance and cryptoeconomics ☆14 Jul 1, 2022 Updated 3 years ago
- ☆14 Aug 28, 2019 Updated 6 years ago
- Code for "Accelerating Training with Neuron Interaction and Nowcasting Networks" [ICLR 2025] ☆28 Feb 20, 2026 Updated last month
- Educational verilog library that supports IEEE754 floating point arithmetic with a parametrizable mantissa and exponent ☆32 Mar 13, 2025 Updated last year
- Qwen3-0.6B megakernel: 527 tok/s decode on RTX 3090 (3.8x faster than PyTorch) ☆86 Feb 10, 2026 Updated 2 months ago
- ☆21 Jan 23, 2024 Updated 2 years ago
- Runs a single CUDA/OpenCL kernel, taking its source from a file and arguments from the command-line ☆25 Updated this week
- TileGraph is an experimental DNN compiler that utilizes static code generation and kernel fusion techniques. ☆11 Sep 18, 2024 Updated last year
- ☆39 Mar 14, 2024 Updated 2 years ago
- ☆12 Mar 1, 2024 Updated 2 years ago
- CacheFlow is a Linux kernel module that exposes the contents of the last-level cache on *most* ARM machines. ☆18 Jun 19, 2024 Updated last year
- Graphical user interface for tensor networks ☆12 Jul 27, 2020 Updated 5 years ago
- Jax implementation of the AdaHessian optimizer ☆20 Mar 11, 2021 Updated 5 years ago
- Collection of small examples for running on ALCF resources ☆21 Updated this week
- Single-cell analysis methods in Rust ☆34 Nov 4, 2025 Updated 5 months ago
- train with kittens! ☆64 Oct 25, 2024 Updated last year
- A dataset of egocentric vision, eye-tracking and full body kinematics from human locomotion in out-of-the-lab environments. Also, differe… ☆12 Nov 5, 2023 Updated 2 years ago
- Manifold-Mixup implementation for fastai V1 ☆19 Oct 1, 2020 Updated 5 years ago
- ☆12 Mar 18, 2024 Updated 2 years ago