modified cutlass
☆15Oct 26, 2020Updated 5 years ago
Alternatives and similar repositories for cutlass-bak
Users that are interested in cutlass-bak are comparing it to the libraries listed below
- World's first Nintendo 3DS emulator for Apple devices based on Citra.☆18Apr 7, 2023Updated 2 years ago
- Implementation of TSM2L and TSM2R -- High-Performance Tall-and-Skinny Matrix-Matrix Multiplication Algorithms for CUDA☆35Jul 28, 2020Updated 5 years ago
- A unified programming framework for high and portable performance across FPGAs and GPUs☆11Mar 23, 2025Updated 11 months ago
- Polyite: Iterative Schedule Optimization for Parallelization in the Polyhedron Model☆12Jan 19, 2020Updated 6 years ago
- GEMM and Winograd based convolutions using CUTLASS☆28Jul 15, 2020Updated 5 years ago
- Utilities for paper writing.☆12Jan 11, 2026Updated last month
- ☆16Sep 24, 2024Updated last year
- ☆15Dec 16, 2021Updated 4 years ago
- HeteroHalide: From Image Processing DSL to Efficient FPGA Acceleration☆15Sep 14, 2020Updated 5 years ago
- Machine Intelligence Shader Autogen. AMDGPU ML shader code generator. (previously iGEMMgen)☆37Jul 30, 2025Updated 7 months ago
- TiledKernel is a code generation library based on macro kernels and memory hierarchy graph data structure.☆19May 12, 2024Updated last year
- ☆20Sep 28, 2024Updated last year
- Multiplication using AVX512 and AVX512IFMA instructions☆23Nov 9, 2015Updated 10 years ago
- ICML2017 MEC: Memory-efficient Convolution for Deep Neural Network, C++ implementation (unofficial)☆17Apr 9, 2019Updated 6 years ago
- TradR.fun is a free and open-source cryptocurrency price signal prediction application.☆18Feb 15, 2026Updated 2 weeks ago
- OriGen: Enhancing RTL Code Generation with Code-to-Code Augmentation and Self-Reflection (ICCAD 2024)☆29Oct 20, 2024Updated last year
- An object detection codebase based on MegEngine.☆28Dec 14, 2022Updated 3 years ago
- The Next-gen Language & Compiler Powering Efficient Hardware Design☆36Jan 16, 2025Updated last year
- Chameleon: Adaptive Code Optimization for Expedited Deep Neural Network Compilation☆27Nov 7, 2019Updated 6 years ago
- Subpart source code of deepcore v0.7☆27Jun 28, 2020Updated 5 years ago
- TileFlow is a performance analysis tool based on Timeloop for fusion dataflows☆66Apr 12, 2024Updated last year
- cuASR: CUDA Algebra for Semirings☆44Aug 22, 2022Updated 3 years ago
- A high-efficiency system-on-chip for floating-point compute workloads.☆44Jan 13, 2025Updated last year
- ☆15Oct 13, 2024Updated last year
- Bilinear Pairings Components Library for Delphi☆12Dec 19, 2018Updated 7 years ago
- Kinematic and dynamic models of continuum and articulated soft robots.☆15Nov 22, 2025Updated 3 months ago
- A multi-headed dumphfdl receiver for Web-888 and other SDR devices☆15Jan 17, 2026Updated last month
- H2-LLM: Hardware-Dataflow Co-Exploration for Heterogeneous Hybrid-Bonding-based Low-Batch LLM Inference☆89Apr 26, 2025Updated 10 months ago
- A language and compiler for irregular tensor programs.☆152Nov 29, 2024Updated last year
- MATLAB function to fill an area with hatching ~~or speckling~~☆11Mar 4, 2018Updated 7 years ago
- Code for the paper "Faster Neural Network Training with Approximate Tensor Operations"☆10Oct 23, 2021Updated 4 years ago
- An artificial matrix generator in C☆12Feb 16, 2023Updated 3 years ago
- An open-source command line interface for linting your Ethereum 2.0 validator setup☆14May 17, 2021Updated 4 years ago
- Benchmarks of all publicly available SNARK/STARK keccak circuits☆13Oct 1, 2023Updated 2 years ago
- Recognize the make/model of vehicles in an image☆11Mar 31, 2023Updated 2 years ago
- BERT Sentiment Classification on the IMDb Large Movie Review Dataset.☆16Sep 8, 2022Updated 3 years ago
- ☆14Apr 14, 2025Updated 10 months ago
- Prototype of fraud proofs.☆12Feb 13, 2022Updated 4 years ago
- Artifact for PPoPP20 "Understanding and Bridging the Gaps in Current GNN Performance Optimizations"☆41Nov 16, 2021Updated 4 years ago