Optimization_4x4_5

Jianyu Huang edited this page Aug 11, 2016 · 4 revisions

Copy the contents of file MMult_4x4_4.c into a file named MMult_4x4_5.c and change the contents:

  • Change the first lines in the makefile to
OLD  := MMult_4x4_4
NEW  := MMult_4x4_5
  • make run
octave:3> PlotAll        % this will create the plot

This time the performance graph will look something like

Now we see a performance benefit when the matrices become large, since the data is reused more after being brought into the registers. The left plot compares 4x4_4 with 4x4_5; the right plot compares 1x4_5 (computing four elements of C at a time) with 4x4_5 (computing sixteen elements of C at a time).
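
To make the reuse argument concrete, here is a minimal sketch of the kind of change this step is assumed to make: the sixteen separate inner-product loops are fused into a single loop over `p`, so each `A(i,p)` and `B(p,j)` that is loaded can be used for four updates before the loop moves on. This page does not show the contents of MMult_4x4_5.c, so the names `AddDot4x4_separate`, `AddDot4x4_fused`, and `fused_matches_reference` are hypothetical, and the sixteen updates are written as two short inner loops for brevity rather than as explicit statements.

```c
/* Column-major indexing macros (assumed to mirror the tutorial's convention). */
#define A(i, j) a[(j) * lda + (i)]
#define B(i, j) b[(j) * ldb + (i)]
#define C(i, j) c[(j) * ldc + (i)]

/* Before: sixteen independent inner products, one full pass over p each.
   Every pass re-reads a row of A and a column of B. */
void AddDot4x4_separate(int k, const double *a, int lda,
                        const double *b, int ldb, double *c, int ldc)
{
    for (int j = 0; j < 4; j++)
        for (int i = 0; i < 4; i++)
            for (int p = 0; p < k; p++)
                C(i, j) += A(i, p) * B(p, j);
}

/* After (hypothetical sketch): one pass over p updates all sixteen C(i,j).
   For a fixed p only four elements of A and four of B are touched, and each
   of them contributes to four different elements of C. */
void AddDot4x4_fused(int k, const double *a, int lda,
                     const double *b, int ldb, double *c, int ldc)
{
    for (int p = 0; p < k; p++)
        for (int j = 0; j < 4; j++)
            for (int i = 0; i < 4; i++)
                C(i, j) += A(i, p) * B(p, j);
}

/* Returns 1 if both variants give identical results on a small test case.
   Each C(i,j) is summed in the same p order in both, so the comparison
   can be exact. */
int fused_matches_reference(void)
{
    enum { K = 8 };
    double a[4 * K], b[K * 4], c1[16] = {0}, c2[16] = {0};

    for (int n = 0; n < 4 * K; n++) {
        a[n] = n % 5 - 2;   /* small integer values, exactly representable */
        b[n] = n % 7 - 3;
    }
    AddDot4x4_separate(K, a, 4, b, K, c1, 4);  /* A is 4 x K, B is K x 4 */
    AddDot4x4_fused(K, a, 4, b, K, c2, 4);

    for (int n = 0; n < 16; n++)
        if (c1[n] != c2[n])
            return 0;
    return 1;
}
```

Both variants perform the same arithmetic; only the loop order differs. Whether the fused form is actually faster depends on the compiler and the machine, which is why the tutorial measures it with `make run` rather than assuming it.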