Similarity of two galaxies calculator
airacuda ~ 85 - 95k megapairs/s (GTX 1070)
barracuda ~ 95 - 110k megapairs/s (GTX 1070)
Implemented using array blocking (tiling) and sum reduction. This requires choosing the appropriate memory space for each piece of data and accessing shared memory in a pattern that avoids bank conflicts.
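The blocking and reduction pattern can be illustrated with a minimal sketch. It is not the project's actual kernel: the names `pair_value`, `pair_sum_kernel`, `pair_sum_gpu`, the tile size of 256, and the use of a dot product as the per-pair quantity are all assumptions made for illustration. Each block stages one tile of the second array in shared memory, every thread combines its own element with the whole tile, and the per-thread sums are then reduced in shared memory with sequential addressing.

```cuda
#include <cassert>
#include <vector>
#include <cuda_runtime.h>

constexpr int TILE = 256;  // assumed threads per block and tile width (power of two)

// Hypothetical per-pair quantity: dot product of two direction vectors.
__host__ __device__ inline float pair_value(float3 a, float3 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Each block writes one partial sum; must be launched with blockDim.x == TILE.
__global__ void pair_sum_kernel(const float3* a, int na,
                                const float3* b, int nb, float* block_sums) {
    __shared__ float3 tile[TILE];    // current tile of b
    __shared__ float partial[TILE];  // per-thread sums for the reduction

    int i = blockIdx.x * blockDim.x + threadIdx.x;
    float3 mine = (i < na) ? a[i] : make_float3(0.0f, 0.0f, 0.0f);
    float acc = 0.0f;

    for (int base = 0; base < nb; base += TILE) {
        int j = base + threadIdx.x;
        if (j < nb) tile[threadIdx.x] = b[j];  // cooperative load of one tile
        __syncthreads();

        int limit = min(TILE, nb - base);
        if (i < na) {
            // All threads of a warp read the same tile[k], which is a
            // shared-memory broadcast rather than a bank conflict.
            for (int k = 0; k < limit; ++k) acc += pair_value(mine, tile[k]);
        }
        __syncthreads();  // do not overwrite the tile while it is still in use
    }

    // Tree reduction with sequential addressing: active threads touch
    // consecutive words, so they fall into distinct banks.
    partial[threadIdx.x] = acc;
    __syncthreads();
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (threadIdx.x < stride)
            partial[threadIdx.x] += partial[threadIdx.x + stride];
        __syncthreads();
    }
    if (threadIdx.x == 0) block_sums[blockIdx.x] = partial[0];
}

// Host wrapper: copies data, launches the kernel, sums the block results.
// Error handling is reduced to a single check to keep the sketch short.
double pair_sum_gpu(const std::vector<float3>& a, const std::vector<float3>& b) {
    int na = static_cast<int>(a.size()), nb = static_cast<int>(b.size());
    if (na == 0 || nb == 0) return 0.0;
    int blocks = (na + TILE - 1) / TILE;

    float3 *da = nullptr, *db = nullptr;
    float* dsums = nullptr;
    cudaMalloc(&da, na * sizeof(float3));
    cudaMalloc(&db, nb * sizeof(float3));
    cudaMalloc(&dsums, blocks * sizeof(float));
    cudaMemcpy(da, a.data(), na * sizeof(float3), cudaMemcpyHostToDevice);
    cudaMemcpy(db, b.data(), nb * sizeof(float3), cudaMemcpyHostToDevice);

    pair_sum_kernel<<<blocks, TILE>>>(da, na, db, nb, dsums);
    cudaError_t err = cudaDeviceSynchronize();
    assert(err == cudaSuccess);
    (void)err;

    std::vector<float> sums(blocks);
    cudaMemcpy(sums.data(), dsums, blocks * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(da); cudaFree(db); cudaFree(dsums);

    double total = 0.0;
    for (float s : sums) total += s;  // final sum over blocks on the host
    return total;
}
```

Calling `pair_sum_gpu(a, b)` with two `std::vector<float3>` inputs returns the sum of `pair_value` over every pair. Writing one partial sum per block and finishing on the host avoids atomics; a second reduction kernel would be the alternative for very large grids.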
Run configuration
make clean -- deletes the compiled binary files
make -- compiles program
make run -- runs program