Hi!
I am struggling to get my sparse matrix multiplication algorithm to run at a reasonable speed once the matrices grow beyond 1000x1000, without using any parallel programming.
My algorithm goes through each value in the multiplying matrix, and for every one of those values it has to scan the entire base matrix in order to update the result matrix entry with the correct row and column. This leaves me with 4 nested for loops, which feels excessive. Does anyone have any advice?
My algorithm steps are as follows.
1. Iterate through all rows of the multiplying index matrix.
2. Iterate through the current row, getting each index.
3. With this index, iterate through each row of the base index matrix.
4. For each index in the base index matrix, if it matches the index from the multiplying index matrix, update the result using the corresponding values.
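To make the steps above concrete, here is a rough sketch of how I read them, in Python. The storage layout is an assumption on my part: each matrix is held as an index matrix (one list of column indices per row) plus a parallel value matrix. The function name, the parameter names, the dense result, and the exact matching rule (a base column index matching the current multiplying row) are all hypothetical and only meant to illustrate the four nested loops.

```python
def multiply_by_full_scan(base_idx, base_val, mult_idx, mult_val, n_cols):
    """Sketch of result = base x mult using four nested loops.

    Assumed (hypothetical) layout: base_idx[i] lists the column indices of
    the non-zeros in row i, and base_val[i] holds the matching values.
    mult_idx / mult_val follow the same layout. n_cols is the number of
    columns of the multiplying matrix. The result is returned dense.
    """
    result = [[0.0] * n_cols for _ in range(len(base_idx))]
    # Step 1: every row k of the multiplying index matrix.
    for k, mult_row in enumerate(mult_idx):
        # Step 2: every stored column index j in that row.
        for p, j in enumerate(mult_row):
            # Step 3: every row i of the base index matrix.
            for i, base_row in enumerate(base_idx):
                # Step 4: every stored column index c in that base row.
                for q, c in enumerate(base_row):
                    # Assumed matching rule: base column == multiplying row.
                    if c == k:
                        result[i][j] += base_val[i][q] * mult_val[k][p]
    return result


# Tiny example: base is 2x3, mult is 3x2.
base_idx = [[0, 2], [1]]
base_val = [[1.0, 2.0], [3.0]]
mult_idx = [[1], [0], [0, 1]]
mult_val = [[4.0], [5.0], [6.0, 7.0]]
product = multiply_by_full_scan(base_idx, base_val, mult_idx, mult_val, 2)
```

Written like this, every non-zero of the multiplying matrix triggers a scan over every non-zero of the base matrix, so the work grows with the product of the two non-zero counts.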
This requires 4 nested for loops, so it is very slow. From an earlier thread, it seems that with 1 thread we should be able to reach a size of 10000.
Thanks for anything :)