forked from wu-kan/HPL-AI

An implementation of the HPL-AI Mixed-Precision Benchmark, based on hpl-2.3


schuangs/HPL-AI


#!/bin/bash
##############################################################
#
# HPL-AI Mixed-Precision Benchmark v2.3a  --  March 14, 2021
#
##############################################################
#
# Check out <https://wu-kan.cn/_posts/2021-03-14-HPL-AI/> for
# the full documentation and the latest information.
#
##############################################################
#
# A quick start to build and run a few tests: ./README

# First, the following software is required on your system:
# C and C++ compilers, autoconf, autoconf-archive, automake, an
# MPI implementation, BLAS, and blaspp.
#
# You can easily install and load the requirements via spack
# <https://github.com/spack/spack/releases/tag/v0.16.1>.
#
# I tested with the following versions; other versions or
# libraries might work as well:

spack unload -a
spack load [email protected]
spack load [email protected]%[email protected]
spack load [email protected]%[email protected]
spack load [email protected]%[email protected]
spack load [email protected]%[email protected]~cxx~cxx_exceptions
spack load [email protected]%[email protected]+openmp~cuda \
    ^[email protected]%[email protected] threads=openmp
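# Before bootstrapping, you can optionally sanity-check that the
# core build tools are on your PATH. A minimal sketch, assuming
# the usual tool names (adjust if your site wraps the compilers):

```shell
# Report any required tool that is missing from PATH.
missing=""
for tool in gcc g++ mpicc autoreconf automake; do
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
done
if [ -z "$missing" ]; then echo "toolchain OK"; else echo "missing:$missing"; fi
```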

# Then bootstrap the configuration files by typing:

autoreconf -ivf

# The software can be compiled with a number of optional
# compile-time flags, passed via CPPFLAGS:
#
# CPPFLAGS=" -DHPLAI_T_AFLOAT=double "
#
# CPPFLAGS=" -DHPLAI_DEVICE_BLASPP_GEMM "
#
# CPPFLAGS=" -DHPLAI_DEVICE_BLASPP_TRSM "
#
# CPPFLAGS=" -DHPLAI_GEN_BLASPP_GEMM "
#
# CPPFLAGS=" -DHPLAI_GEN_BLASPP_TRSM "
#
# CPPFLAGS=" -DHPLAI_GEN_BLASPP_TRSV "
#
# CPPFLAGS=" -DHPLAI_PMAT_REGEN "
#
# CPPFLAGS=" -DHPL_COPY_L "
#
# CPPFLAGS=" -DHPL_CALL_CBLAS "
#
# CPPFLAGS=" -DHPL_CALL_VSIPL "
# (deprecated)
#
# CPPFLAGS=" -DHPL_DETAILED_TIMING "
# (deprecated)
#
# To configure the build and prepare for compilation run:

./configure
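# As an example (one hypothetical combination of the flags listed
# above, not the only valid one), to accumulate in double precision
# and use the generic blaspp GEMM and TRSM paths you would instead
# run:

```shell
# Sketch: pass several of the optional CPPFLAGS macros at once.
./configure CPPFLAGS=" -DHPLAI_T_AFLOAT=double \
    -DHPLAI_GEN_BLASPP_GEMM \
    -DHPLAI_GEN_BLASPP_TRSM "
```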

# Note: to use device blaspp routines, you may need to enable
# CUDA support of blaspp:
#
# spack load [email protected]%[email protected]+openmp+cuda
#
# and then:
#
# ./configure \
#     LIBS=" -lcudart -lcublas " \
#     CPPFLAGS=" -DBLASPP_WITH_CUBLAS \
#         -DHPLAI_DEVICE_BLASPP_GEMM \
#         -DHPLAI_DEVICE_BLASPP_TRSM "

# Then compile:

make -j

# The configuration file must be called HPL.dat.
#
# You can copy the configuration file from the original HPL,
# or create a new one.
#
# Most of the performance parameters can be tuned.

if true; then
    cp testing/ptest/HPL.dat HPL.dat
else
    cat >HPL.dat <<EOF
HPLinpack benchmark input file
Innovative Computing Laboratory, University of Tennessee
HPL.out      output file name (if any)
6            device out (6=stdout,7=stderr,file)
1            # of problems sizes (N) 
16384 143360 Ns  
1            # of NBs 
384 192 256  NBs 
1            PMAP process mapping (0=Row-,1=Column-major)
1            # of process grids (P x Q)
2 1 4        Ps  
2 4 1        Qs  
16.0         threshold
1            # of panel fact
2 1 0        PFACTs (0=left, 1=Crout, 2=Right)
1            # of recursive stopping criterium
2            NBMINs (>= 1)
1            # of panels in recursion
2            NDIVs
1            # of recursive panel fact.
2 1 0        RFACTs (0=left, 1=Crout, 2=Right)
1            # of broadcast
0            BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
1            # of lookahead depth
0            DEPTHs (>=0)
0            SWAP (0=bin-exch,1=long,2=mix)
1            swapping threshold
1            L1 in (0=transposed,1=no-transposed) form
1            U  in (0=transposed,1=no-transposed) form
0            Equilibration (0=no,1=yes)
8            memory alignment in double (> 0)
EOF
fi
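# A rough rule of thumb when tuning Ns: choose N so that the
# double-precision matrix (8 bytes per entry) fills about 80% of
# the total memory of the run, rounded down to a multiple of NB.
# A sketch; the 64 GiB figure is a placeholder for your machine's
# aggregate memory:

```shell
mem_gib=64   # placeholder: total memory available to the run, in GiB
nb=384       # block size NB from HPL.dat
budget=$((mem_gib * 1024 * 1024 * 1024 * 80 / 100))   # 80% of memory, in bytes
# Largest N with 8*N*N <= budget, rounded down to a multiple of NB.
n=$(awk -v b="$budget" -v nb="$nb" \
    'BEGIN { m = int(sqrt(b / 8)); print int(m / nb) * nb }')
echo "suggested N = $n"
```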

# Finally, run the benchmark and compare with the original hpl-2.3:

# (-x exports an environment variable to the MPI ranks; this is
# Open MPI syntax, so adapt it for other MPI implementations)
mpiexec -n 4 -x OMP_NUM_THREADS=2 testing/xhpl
mpiexec -n 4 -x OMP_NUM_THREADS=2 testing/xhplai
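# Each run prints a summary line whose first field encodes the
# parameter combination (it begins with "W" in stock hpl-2.3
# output) and whose last field is the Gflops rate. A parsing
# sketch; the sample line below uses placeholder numbers purely to
# illustrate, real values come from the xhpl/xhplai output:

```shell
# Extract the Gflops figure from an HPL-style result line.
sample='WR00L2L2       16384   384     2     2              12.34  2.468e+02'
gflops=$(printf '%s\n' "$sample" | awk '$1 ~ /^W/ { print $NF }')
echo "Gflops: $gflops"
```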

# If you obtained HPL-AI via git, you can remove all build
# artifacts with:

git clean -d -f -q -x

##############################################################
#
# The newest version of HPL-AI is available at
# <https://github.com/wu-kan/HPL-AI/releases>
#
##############################################################
#
# Bugs are tracked at
# <https://github.com/wu-kan/HPL-AI/issues>
#
##############################################################
#
# The source code of HPL-AI is licensed under the terms in `COPYING`.
#
# The source code of hpl-2.3 is licensed under the terms in `COPYRIGHT`.
#
##############################################################
