spectre900 / Parallel-Strassen-Algorithm
Parallelizing Strassen’s matrix multiplication using OpenMP, MPI and CUDA.
☆15 Updated 3 years ago
Alternatives and similar repositories for Parallel-Strassen-Algorithm:
Users who are interested in Parallel-Strassen-Algorithm are comparing it to the repositories listed below
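- Coursework for the Advanced Computer Architecture course at the University of Chinese Academy of Sciences: a complete RTL-to-GDS flow using OpenROAD-flow ☆24 Updated 4 years ago
- Several small parallel programming experiments done with OpenMP and MPI: matrix multiplication, matrix LU decomposition, and document vectorization for document classification ☆28 Updated 3 years ago
- A five-stage pipelined CPU with 89 instructions, based on Loongson OpenMIPS, written in Verilog and using a Harvard architecture. The 89 instructions include logic and shift, multiply and divide, load/store, branch, coprocessor access, and exception-related instructions. It handles data hazards and includes pipeline stalls and delay slots ☆20 Updated 4 years ago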
- A simple SAT solver based on the CDCL algorithm ☆19 Updated 5 years ago
- How to optimize sgemm on single-threaded ARM CPUs, multi-threaded ARM CPUs, and Nvidia GPUs ☆20 Updated 3 years ago
- A simple CNN framework implemented in C++ to train on MNIST data. Done! ☆10 Updated 2 years ago
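- Advanced Computer Architecture 2020, instructor Wu Junmin (吴俊敏), a graduate course at USTC ☆59 Updated 11 months ago
- The official MIPS source code provided by Loongson, together with the source code after my own reorganization of the file structure ☆14 Updated 5 years ago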
- Personal Notes for Learning HPC & Parallel Computation [Active Adding New Content] ☆61 Updated 2 years ago
- Implementation of Parallel Breadth-First Search on Distributed Memory Systems ☆11 Updated 9 years ago
- Code templates for ACM, OI, and OJ ☆103 Updated 6 months ago
- Documentation for HPC course ☆140 Updated 3 weeks ago
- Parallel sparse direct solver for circuit simulation ☆41 Updated 2 years ago
- A Method for efficiently processing SpMV using SIMD and load balancing ☆16 Updated 2 years ago
- A Convolutional Neural Network Accelerator, which speeds up convolution calculation. Based on the Xilinx HLS design suite. ☆12 Updated 3 years ago
- Implementation of a parallel breadth-first search algorithm for graph traversal using CUDA and C++. ☆31 Updated 5 years ago
- Several common methods of matrix multiplication are implemented on CPU and Nvidia GPU using C++11 and CUDA. ☆14 Updated last year
- ☆40 Updated last week
- Three Matrix-Multiplication-Algorithms: Generate Algorithm, Strassen Algorithm and Coppersmith-Winograd Algorithm ☆27 Updated 3 years ago
- SpV8 is a SpMV kernel written in AVX-512. Artifact for our SpV8 paper @ DAC '21. ☆27 Updated 3 years ago
- Split large FIRRTL into separated modules for incremental compilation. ☆10 Updated 3 years ago
- Introduction to Computer Systems (II), Spring 2021 ☆48 Updated 3 years ago
- Chongqing University 2020 NSCSCC ☆28 Updated 4 years ago
- parallelProgramingProject: code appendix for the "Advanced Parallel Programming" course report. Directory structure: cannon/ Cannon algorithm… ☆36 Updated 10 years ago
- A Thread-Level Synchronization-Free Sparse Triangular Solve on GPUs ☆56 Updated 3 years ago
- Computer Architecture Seminar, Fall 2020, UCAS, "CPU Design in Practice" Lab3-Lab9 ☆26 Updated 3 years ago
- Finite Field Operations on GPGPU ☆14 Updated last year
- Online judge server for Verilog | verilogoj.ustc.edu.cn ☆77 Updated 7 months ago
- Code base and slides for ECE408:Applied Parallel Programming On GPU. ☆119 Updated 3 years ago
- Operating Systems 2019 ucore labs ☆46 Updated 5 years ago