When multiplying two matrices, A*B = C, either of them can have a large number of negligible entries, i.e. values near zero. The zeros don't follow any block structure.
What options do I have to reduce the number of operations?
My first thought was to permute the matrices into a block-zero structure, but finding that permutation may itself cost O(n^3). CRS or CCS storage doesn't seem to have many ready-to-use dgemm equivalents.
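
To make the CRS route concrete, here is a minimal sketch of what I have in mind: drop entries below a tolerance, store the rest in CRS, and multiply row by row so the work scales with the number of retained nonzeros. The names `to_csr`, `csr_matmul` and the tolerance `tol` are my own illustrative choices, not from any library, and this is plain Python for clarity, nowhere near dgemm-level performance.

```python
def to_csr(dense, tol):
    """Convert a dense matrix (list of rows) to CRS, dropping entries with |x| <= tol."""
    data, indices, indptr = [], [], [0]
    for row in dense:
        for j, x in enumerate(row):
            if abs(x) > tol:
                data.append(x)      # retained value
                indices.append(j)   # its column index
        indptr.append(len(data))    # end of this row in data/indices
    return data, indices, indptr


def csr_matmul(a, b, n_cols_b):
    """Row-by-row product of two CRS matrices.

    Returns the result as a dense list of rows plus the number of
    scalar multiplications actually performed.
    """
    a_data, a_idx, a_ptr = a
    b_data, b_idx, b_ptr = b
    n_rows = len(a_ptr) - 1
    c = [[0.0] * n_cols_b for _ in range(n_rows)]
    mults = 0
    for i in range(n_rows):
        # For each retained A[i][k], scale the retained part of row k of B.
        for p in range(a_ptr[i], a_ptr[i + 1]):
            k, a_ik = a_idx[p], a_data[p]
            for q in range(b_ptr[k], b_ptr[k + 1]):
                c[i][b_idx[q]] += a_ik * b_data[q]
                mults += 1
    return c, mults


# Small made-up example: most entries are negligible, with no block pattern.
A = [[2.0, 1e-12, 0.0],
     [0.0, 3.0, 1e-15],
     [1e-13, 0.0, 4.0]]
B = [[1.0, 0.0, 1e-14],
     [0.0, 5.0, 0.0],
     [6.0, 0.0, 1e-12]]

C, mults = csr_matmul(to_csr(A, 1e-9), to_csr(B, 1e-9), n_cols_b=3)
print(C, mults)
```

With this toy input the product needs 3 scalar multiplications instead of the 27 a dense 3x3 multiply would do, at the price of an error bounded by the dropped entries. Whether that trade is acceptable depends on how small "negligible" really is relative to the retained values.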