
Kingspeak User Guide 

Contents

Kingspeak Cluster Hardware (General) Overview

  • 32 dual-socket eight-core nodes and 4 dual-socket ten-core nodes (596 total cores) 
  • Intel Xeon (Sandy Bridge/Ivy Bridge E5-2670) processors
  • 2.6 GHz clock speed with AVX support
  • 64 GB of memory per node (4 GB per processor core)
  • Mellanox FDR InfiniBand interconnect
  • Gigabit Ethernet interconnect for management
  • 2 interactive nodes 

NFS Home Directory

Your home directory, an NFS-mounted file system, is one choice for I/O. It typically has the slowest I/O performance of the available file systems. This space is visible to all nodes on the clusters through an auto-mounting system.

NFS Scratch (/scratch/kingspeak/serial)

Kingspeak has access to another NFS file system: /scratch/kingspeak/serial. This file system has 175 TB of disk capacity. It is attached to the InfiniBand network to provide greater potential network bandwidth. It is visible (read and write access) on all of Kingspeak's interactive and compute nodes. However, it is still a shared resource and may therefore perform more slowly when subjected to significant user load. Users should test their applications' performance to see whether they encounter any unexpected performance issues in this space. This file system has a scrub policy: files older than 60 days are deleted.

Local Disk (/scratch/local)

The local scratch space is a storage space unique to each individual node. The local scratch space is cleaned aggressively, with files older than 1 week being scrubbed. It can be accessed on each node through /scratch/local. This space will be one of the fastest, but certainly not the largest (337 GB). Users must remove all their files from /scratch/local at the end of their calculation.

It is a good idea to stage data from one storage system to another when you are running jobs.  At the start of the batch job, copy the data files from the home directory to the scratch space; at the end of the run, copy the output back to the user's home directory. An example of this flow is shown below.
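For illustration, a minimal tcsh fragment of such a flow inside a batch script could look like the following (the directory and file names are placeholders only):

    # stage the input from the home directory to the serial scratch space
    mkdir -p /scratch/kingspeak/serial/$USER/myjob
    cp $HOME/myjob/input.dat /scratch/kingspeak/serial/$USER/myjob

    # ... run the calculation, reading and writing in the scratch directory ...

    # copy the results back to the home directory and clean up the scratch space
    cp /scratch/kingspeak/serial/$USER/myjob/output.dat $HOME/myjob
    rm -rf /scratch/kingspeak/serial/$USER/myjob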

It is important to keep in mind that ALL users must remove excess files on their own. Preferably this is done within the user's batch job once the computation has finished. Leaving files in any scratch space creates an impediment to other users who are trying to run their own jobs. Simply delete all extra files from any space other than your home directory as soon as they are no longer needed.

For all policies regarding CHPC file storage see 3.1 File Storage Policies.

Important Differences from Other CHPC Clusters – NEW!

  • Note the change in the naming convention in the paths to the cluster-specific applications and in the Moab and Torque/PBS commands: this cluster is kingspeak.peaks, not kingspeak.arches
  • Kingspeak is running the RHEL 6 OS versus the RHEL 5 run on the other CHPC clusters.  Ember will be upgraded to this OS in the near future, but the other clusters will not
    • this has necessitated a change in the versioning system: applications in /uufs/chpc.utah.edu/sys/pkg use "std" in the build name if the build runs on both RHEL 5 and RHEL 6
  • Kingspeak is running a newer version of the Moab batch scheduler; again, only Ember will be moved to this version.
    • With this new version you can no longer qsub to, or query, the batch queues of the other clusters from kingspeak, or the batch queue of kingspeak from any of the other clusters
  • All users must update their .tcshrc/.bashrc before starting work on kingspeak. This new version of the .tcshrc/.bashrc will work on all the other clusters.
  • The new scratch name includes "kingspeak" — /scratch/kingspeak/serial — as a way to associate this scratch with the cluster.  Right now access to this new scratch file system is restricted to kingspeak and the older scratch systems are not currently mounted on kingspeak.
  • The general pool of resources has nodes of different core counts – see section below on ways to use this mixed resource.

FAQ Section – NEW!

NOTE – we will add to this section as we get questions from users

  1. Why is my mpi job giving the error:  
    /uufs/chpc.utah.edu/sys/pkg/intel/ics/composer_xe_2013.3.163/mpirt/bin/intel64/mpirun: line 96: 
    /uufs/chpc.utah.edu/sys/pkg/intel/ics/composer_xe_2013.3.163/mpirt/bin/intel64/mpivars.sh:  No such file or directory ?

    When using the Intel compiler with MPICH2, MVAPICH2, or OpenMPI, you must source the script that sets up the MPI environment AFTER sourcing the script that sets up the compiler environment, e.g.,

                      source /uufs/chpc.utah.edu/sys/pkg/intel/ics/bin/compilervars.csh intel64

                      source /uufs/kingspeak.peaks/sys/pkg/openmpi/std_intel/etc/openmpi.csh

  

Kingspeak Cluster Usage

CHPC resources are available to qualified faculty, students (under faculty supervision), and researchers from any Utah institution of higher education. Users can request accounts for CHPC computer systems by filling out an account request form. This can be found by following this link: account request form.

Users requiring priority for their jobs may apply for an allocation of Service Units (SUs) per quarter by sending a brief proposal using the allocation form found at: Allocation form

Kingspeak Cluster Access

The Kingspeak cluster can be accessed via ssh (secure shell) at the following address:

  • kingspeak.chpc.utah.edu

All CHPC machines mount the same user home directories. This means that the user files on Kingspeak will be exactly the same as the ones on the other CHPC clusters. The advantage is obvious: users do not need to copy files between machines. However, users must make sure that they run the correct executables. CHPC-maintained applications with executables suitable for use on all clusters are kept in /uufs/chpc.utah.edu/sys/pkg, whereas cluster-specific executables (MPI-based applications built with the MPI optimized for the specific cluster's InfiniBand) for Kingspeak are found in /uufs/kingspeak.peaks/sys/pkg.

Another complication associated with the use of a single home directory across all systems is the shell initialization scripts (which are executed at each login). Environment variables (especially paths to applications) vary from cluster to cluster.
CHPC provides a login script which detects the machine to which one is logging in. This login script enables users to switch on/off (machine-specific) initializations for packages which have been installed on the cluster (e.g. Gaussian, Matlab, TotalView, ...) and to set cluster-dependent MPI defaults.

At the present time, the CHPC supports two types of shells: tcsh and bash. Tcsh shell users need to select the .tcshrc login script. Users whose shell is bash need the .bashrc file to log in.

Either click here to download the default .tcshrc login script (CHPC systems)
or click here to download the default .bashrc login script (CHPC systems)

The first part of the .tcshrc/.bashrc script determines the type of machine one is logged in to based on a few parameters: the machine's operating system, its IP address, or its UUFSCELL variable (defined at the system level). Upon each login the list of CHPC Linux machine addresses is retrieved from the CHPC webserver. In case of a successful retrieval the address list is stored in the file linuxips.csh (tcsh) or linuxips.sh (bash). If the CHPC webserver is non-responsive, the .tcshrc/.bashrc script uses the address list from previous sessions or, in the absence of the latter, issues a warning.
The .tcshrc/.bashrc script then looks up the machine's IP address and performs a host-specific initialization.

Below you will find a slice from a .tcshrc login script (example!) which performs the initialization of the user's environment on the Kingspeak cluster.
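The CHPC-provided script itself is not reproduced here; the fragment below is only a rough sketch of what the Kingspeak portion looks like. The two source lines are taken from elsewhere in this guide, while the commented-out Gaussian line is a hypothetical package initialization with a made-up path.

    # Kingspeak (kingspeak.peaks) specific initialization -- sketch only
    if ( $UUFSCELL == "kingspeak.peaks" ) then
       # compiler environment first, then MPI environment (the order matters, see the FAQ above)
       source /uufs/chpc.utah.edu/sys/pkg/intel/ics/bin/compilervars.csh intel64
       source /uufs/kingspeak.peaks/sys/pkg/openmpi/std_intel/etc/openmpi.csh
       # optional package initializations -- uncomment to enable (hypothetical path)
       #source /uufs/chpc.utah.edu/sys/pkg/gaussian09/etc/g09.csh
    endif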

In the example above (and in general) everything following a # sign is a comment. Specific package initializations can be turned off by inserting a comment sign (#) at the start of the line. Please do not comment out lines of the login script that do not start with the source command.

We recommend that users do not customize this file beyond turning the different package initializations on or off; instead, put any customizations in a file called .aliases in your home directory and uncomment the line "source .aliases" found at the bottom of the CHPC-provided scripts.

The initialization on all the other CHPC clusters works in a similar fashion. Although the bash syntax differs from the tcsh syntax, the bash script performs the same operations as the code in the tcsh script.

After the numerous host-specific initializations, the last section of the login script performs a global initialization (i.e. identical for all machines). In this section one can set various commands such as aliases, prompt format,...

If the user also mounts the CHPC home directory on his/her own desktop (as most people do), then we recommend setting the variable MYMACHINE to the IP address of the user's own machine. The address of one's own machine can be found by issuing the command:
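(one common way on a Linux desktop; the interface name eth0 is an assumption and may differ on your machine)

    /sbin/ifconfig eth0 | grep "inet addr"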

For example:
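(the address shown is only a placeholder)

    inet addr:192.0.2.42  Bcast:192.0.2.255  Mask:255.255.255.0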

If one uses the tcsh shell, then the MYMACHINE variable (for the example above) in the .tcshrc file is set to:
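(a sketch; replace the placeholder address with your own, and note that the exact form may differ slightly in the CHPC-provided script)

    set MYMACHINE = "192.0.2.42"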

If the bash shell is used the MYMACHINE variable (for the example above) in the .bashrc file becomes:
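(again a sketch with a placeholder address)

    export MYMACHINE="192.0.2.42"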

Find the line containing $MYIP == $MYMACHINE in the login script, and add as many customizations as you please in the lines following the $MYIP == $MYMACHINE statement.

Using the Batch System

The batch implementation on all CHPC systems consists of the Torque resource manager (an implementation of PBS) and the Moab scheduler.

Any process which requires more than 15 minutes run time needs to be submitted through the batch system.

Concerning wall time, we have a hard limit of 72 hours for "general" Kingspeak jobs. If you find you need longer than this, please contact CHPC.  Users without any allocation (see the Cluster Usage section above) or those who have used all of their allocation can still run, but in "freecycle" mode. However, "freecycle" jobs are preemptable, i.e., they are subject to termination if a job with allocation needs the resources. We suggest starting with some small runs to gauge how long the production runs will take. As a rule of thumb, request a wall time which is 10-15% larger than the expected run time. If you specify a shorter wall time, you risk having your job killed before it finishes. If you set a wall time which is too large, you may face a longer wait in the queue.

Runs in the batch system generally pass though the following steps:

  1. The creation of a batch script
  2. The job's submission to the batch system
  3. Checking the job's status

For more details on the  batch policies, please visit the CHPC Batch Policies web page

The creation of a batch script on the Kingspeak cluster

A shell script is a bundle of shell commands which are fed one after another to a shell (bash, tcsh,..). As soon as the first command has successfully finished, the second command is executed. This process continues until either an error occurs or the complete array of individual shell commands has been executed. A batch script is a shell script which defines the tasks a particular job has to execute on a cluster.

Below this paragraph a batch script example for running in PBS on the Kingspeak cluster is shown. The lines at the top of the file all begin with #PBS; these are interpreted by the shell as comments but provide options to PBS and Moab.

Example PBS Script for Kingspeak:
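The sketch below is consistent with the description that follows; the account name, email address, job name, program name, and file names are placeholders that you must replace with your own.

    #PBS -S /bin/tcsh
    #PBS -A ACCOUNTNAME
    #PBS -l nodes=2:ppn=16,walltime=24:00:00
    #PBS -m abe
    #PBS -M your_email@utah.edu
    #PBS -N myjob

    # create a job-specific directory in the serial scratch space and stage the input there
    set SCRDIR = /scratch/kingspeak/serial/$USER/$PBS_JOBID
    mkdir -p $SCRDIR
    cp $HOME/working_directory/input.dat $SCRDIR

    # run the executable from the working directory, reading from and writing to scratch
    cd $HOME/working_directory
    mpirun -np 32 -machinefile $PBS_NODEFILE ./my_program $SCRDIR/input.dat $SCRDIR/output.dat

    # copy the results back to the home directory and clean up the scratch space
    cp $SCRDIR/output.dat $HOME/working_directory
    rm -rf $SCRDIR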

Note that in the example above we don't specify the queue. For more info on qos=long, please see below.

The #PBS -S statement (1st line) indicates the shell that will be used to interpret the script.

The #PBS -A line, while optional, is recommended. Your account is typically your PI's last name and refers to the allocation account to be used for the job.

The next  line (#PBS -l) specifies the resources that you need to run your job. Your requests have to be consistent with the existing CHPC policies and based on what is available. See the next section for options for this line on kingspeak.

In the example above, 32 cores, i.e., 2 nodes with 16 processors (cores) each (ppn stands for processors per node), are requested for 24 hours (wall time). Most Kingspeak nodes have 16 cores per node; the general pool also contains 20-core nodes (see the Resource Specification Options section below). We recommend that you always specify both the number of nodes and the number of cores per node.

The #PBS -m and #PBS -M lines, which are optional, specify to whom and when the server should send an email. The #PBS -m abe line in the example above tells the server to send an email when the job aborts (a), begins (b), and ends (e). On the #PBS -M line, be sure to change the email address to your own.

The #PBS -N option, followed by a name for the job, allows you to attach a name to the job id. This makes it easier to identify your job in the queue.

In the case of any unscheduled downtime (such as power outages, instances where the cooling fails and the systems are taken down quickly) any jobs actively running in the batch queue will be requeued and restarted from the beginning of the job. Note that this will not happen for scheduled down times as the queues are drained before the system is taken down.

If this requeueing/restarting from the beginning behavior is NOT acceptable (e.g. if you want to check it first before restarting), you can add the following option to your PBS script:

#PBS -r n

Please have a look at the options listed below for the available PBS flags.
Finally, in this example we suggest using the serial scratch space /scratch/kingspeak/serial, which is visible on all nodes. However, it is a shared resource and is not backed up. We advise users to copy important data to their home directory at the end of a job.

Note that we run the executable from the working_directory, but read the input files from and write the output files to /scratch/kingspeak/serial.

Resource Specification Options

The existence of nodes with different core counts gives rise to some additional ways to specify the nodes being requested.  Below are the options:

  1. Using '#PBS -l nodes=X:ppn=Y' will give you X nodes, each having Y cores.  This is the best option if you want nodes of the same type and a set number of cores.  In this case the $PBS_NODEFILE has one entry per core.
  2. If you do not want to use all of the cores on a node (for example, if memory is a limitation or you are benchmarking scaling within a node), there are two options, which can be used on any cluster. In both cases the $PBS_NODEFILE will correctly have Y entries for each node received.  Note that in this case your job is still charged as if it were using the entire node, as we do not allow node sharing.
    1.  '#PBS -l nodes=X:ppn=Y -W x=nmatchpolicy:exactnode'
    2.  '#PBS -l procs=X,tpn=Y' 
  3. If you want X nodes but you do not care whether they are X 16-core nodes or X 20-core nodes, you can use the "task resource list" or trl option instead of requesting nodes.  This option also works if you have a narrow range on the number of cores. In this case, as in case 1, the $PBS_NODEFILE has one entry per core. To get the number of cores for your mpirun -np flag you can count the lines of the $PBS_NODEFILE, e.g.,  set NCORE=`cat $PBS_NODEFILE | wc -l`  Here are a couple of examples of the format of the trl option:
    1. For one node of either core count currently on kingspeak:  '#PBS -l trl=16:20' 
    2. For two nodes of either core count, but both with the same core count:  '#PBS -l trl=32:40'  – This can be extrapolated to X nodes, all with the same core count.
    3. For two nodes of either core count, where it does not matter whether they are the same:  '#PBS -l trl=32:36:40'  – This accepts two 16-core nodes (32), one 16-core node plus one 20-core node (36), or two 20-core nodes (40).
  4. For X nodes of any size you can use '#PBS -l nodes=X -W x=nmatchpolicy:exactnode'.  Note that no ppn is set.  This will give you X nodes and your $PBS_NODEFILE will have one entry per node.  If you need this file to have one entry per core, as needed for MPI, you can use the following script (the first version is for tcsh, the second for bash) to create such a file from the existing $PBS_NODEFILE.  As above, once you have this file you can count its lines to determine the number of cores.

    # Description: This script takes the existing PBS_NODEFILE
    # and converts it to a form that will allow mpirun to use all
    # cores on all nodes allocated for the job. 
    # The new name of the node file is $pbs_new
    #
    # Usage: In your PBS batch script either add the line
    #        source <path-to-script>/pbsmangle.csh
    #       OR place the following directly in your script   
    #
    #################################################################
    #inform the user that the script is in effect
    echo "Running the mixed node mangling script...."
    #setup the new PBS_NODEFILE
    setenv PBS_OLDNODEFILE $PBS_NODEFILE
    set pbs_new="$HOME/$PBS_JOBID.mangled"
    rm -f $pbs_new
    touch $pbs_new
    
    #play it safe; assume that the PBS_NODEFILE is not formed properly
    cat $PBS_NODEFILE | sort | uniq > /tmp/$PBS_JOBID.old
    #for each line in $PBS_NODEFILE
    foreach i ( `cat /tmp/$PBS_JOBID.old` )
        #get the info on the node, find the np line, then strip 'np =' and whitespace
        set npcount=`pbsnodes $i | grep np | sed 's/np =//' | sed 's/^[ \t]*//'`
        foreach j ( `seq 1 $npcount` )
            echo $i >> $pbs_new
        end
    end
    #cleanup
    rm -f /tmp/$PBS_JOBID.old
    
    #output useful things to the user for debugging
    set count=`cat $pbs_new | wc -l`
    set ncount=`cat $PBS_OLDNODEFILE | wc -l`
    echo "Script done!"
    echo "The file $pbs_new now contains references to $count cores on $ncount nodes and should be used as your machinefile."
    echo "If this is not what you expect, either remove the reference to this"
    echo "script in your batch script or adjust your PBS paramters."
    echo "The old PBS_NODEFILE is now assigned to PBS_OLDNODEFILE."
    
    
    # Description: This script takes the existing PBS_NODEFILE
    # and converts it to a form that will allow mpirun to use all
    # cores on all nodes allocated for the job. 
    # The new name of the nodefile is $pbs_new.
    #
    # Usage: In your PBS batch script add the line
    #        source <path-to-script>/pbsmangle.sh
    #      OR add the following directly in your script
    #
    #################################################################
    #inform the user that the script is in effect
    echo -ne "Running the mixed node mangling script...."
    #setup the new PBS_NODEFILE
    export PBS_OLDNODEFILE=$PBS_NODEFILE
    pbs_new=$HOME/$PBS_JOBID.mangled
    rm -f $pbs_new
    touch $pbs_new
    
    #play it safe; assume that the PBS_NODEFILE is not formed properly
    cat $PBS_NODEFILE | sort | uniq > /tmp/$PBS_JOBID.old
    #for each line in $PBS_NODEFILE
    for i in $(cat /tmp/$PBS_JOBID.old)
      do
        #get the info on the node, find the np line, then strip 'np =' and whitespace
        npcount=$(pbsnodes $i | grep np | sed 's/np =//' | sed 's/^[ \t]*//')
        for j in $(seq 1 $npcount)
          do
            echo $i >> $pbs_new
          done
      done
    # do some cleanup
    rm -f /tmp/$PBS_JOBID.old
    
    #output useful things to the user for debugging
    count=$(cat $HOME/$PBS_JOBID.mangled | wc -l)
    ncount=$(cat $PBS_OLDNODEFILE | wc -l)
    echo -e "Script done!"
    echo "The file $HOME/$PBS_JOBID.mangled now contains references to $count cores on $ncount nodes and should be used as your machinefile."
    echo -e "If this is not what you expect, either remove the reference to this"
    echo -e "script in your batch script or adjust your PBS paramters."
    echo -e "The old PBS_NODEFILE is now assigned to PBS_OLDNODEFILE."
    
    

Job Submission on Kingspeak

In order to submit a job on Kingspeak, one first has to log in to a Kingspeak interactive node. Note that this is a change introduced by the new version of Moab: in the past on our other clusters you could use the full path to the command and look at the queues of all clusters from an interactive node of any one of them. The job submission is then done with the qsub command in PBS (or the msub command in Moab) followed by the name of the script.

For example, to submit a script named pbsjob, just type:
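    qsub pbsjob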


PBS sets and expects a number of arguments in a PBS script. For more information on PBS commands and their arguments, please type:
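    man PBS_COMMAND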

where PBS_COMMAND stands for a PBS command such as e.g. qsub. Click on this link to see a few additional PBS commands.

Checking the status of your job

To check the status of your job, use the "showq" command in Moab.

  • showq

Additional  PBS and Moab commands are discussed below. 

PBS Batch Script Options

  • -a date_time.  Declares the time after which the job is eligible for execution. The date_time element is in the form: [[[[CC]YY]MM]DD]hhmm[.S].
  • -e path.  Defines the path to be used for the standard error stream of the batch job. The path is of the form: [hostname:]path_name.
  • -h.  Specifies that a user hold will be applied to the job at submission time.
  • -I.  Declares that the job is to be run "interactively". The job will be queued and scheduled as PBS batch job, but when executed the standard input, output, and error streams of the job will be connected through qsub to the terminal session in which qsub is running.
  • -j join.  Declares whether the standard error stream of the job will be merged with the standard output stream. The join argument is one of the following:
    • oe:  Directs both streams to standard output.
    • eo:  Directs both streams to standard error.
    • n:  The two streams remain separate (default).
  • -l resource_list.  Defines the resources that are required by the job and establishes a limit on the amount of resources that can be consumed. Users will want to specify the walltime resource, and if they wish to run a parallel job, the ncpus resource.
  • -m mail_options.  Conditions under which the server will send a mail message about the job. The options are:
    • n: No mail ever sent
    • a (default): When the job aborts
    • b: When the job begins
    • e: When the job ends
  • -M user_list.  Declares the list of e-mail addresses to whom mail is sent. If unspecified it defaults to userid@host from where the job was submitted. You will most likely want to set this option.
  • -N name.  Declares a name for the job.
  • -o path.  Defines the path to be used for the standard output stream of the batch job. The path is of the form: [hostname:]path_name.
  • -q destination.  The destination is the queue.
  • -S path_list.  Declares the shell that interprets the job script. If not specified it will use the user's login shell.
  • -v variable_list.  Expands the list of environment variables which are exported to the job. The variable list is a comma-separated list of strings of the form variable or variable=value.
  • -V.  Declares that all environment variables in the qsub command's environment are to be exported to the batch job.

PBS User Commands

For any of the commands listed below you may do a "man command" for syntax and detailed information.

Frequently used PBS user commands:

  • qsub Submits a job to the PBS queuing system. Please see qsub Options below.
  • qdel Deletes a PBS job from the queue.
  • qstat Shows status of PBS batch jobs.

Less Frequently-Used PBS User Commands:

  • qalter. Modifies the attributes of a job.
  • qhold. Requests that the PBS server place a hold on a job.
  • qmove. Removes a job from the queue in which it resides and places the job in another queue.
  • qmsg. Sends a message to a PBS batch job. To send a message to a job is to write a message string into one or more of the job's output files.
  • qorder. Exchanges the order of two PBS batch jobs within a queue.
  • qrerun. Reruns a PBS batch job.
  • qrls. Releases a hold on a PBS batch job.
  • qselect. Lists the job identifier of those jobs which meet certain selection criteria.
  • qsig. Requests that a signal be sent to the session leader of a batch job.

Moab Scheduler User Commands

  • showq - displays jobs which are running, active, idling and non-queued.
  • showbf - shows backfill.
  • showstart - shows the estimated start time of a job.
  • checkjob - displays status of a job.
  • showres - shows active reservations.

Each command accepts the -h flag, which displays help.

PBS and Moab commands are located in "/uufs/kingspeak.peaks/sys/bin". Please see the Moab Scheduler documentation for more information.

Available Compilers

C/C++

The Kingspeak cluster offers several compilers. The GNU Compiler Suite includes the ANSI C and C++ compilers. The version installed with the RedHat EL 6 OS is 4.4.7. Newer versions of the GNU compilers are also maintained (currently /uufs/chpc.utah.edu/sys/pkg/gcc/4.7.2_rh6).

In addition to the GNU compilers, we offer two commercial compiler suites: the Intel and the Portland Group Compiler (PGI) suites.

The Intel compilers generally provide superior performance. The suite includes C and C++ compilers.

The Portland Group Compiler Suite is another good compiler distribution which has compilers for the aforementioned languages.

GNU compilers

The GNU compilers are located in the default area, i.e. compilers in /usr/bin, libraries in /usr/lib or /usr/lib64, header files in /usr/include, etc. The C compiler is called gcc; the C++ compiler, g++.  Below is a very basic example of how to generate an executable (exe) from a C source file (source.c) or a C++ source file (source.cc):
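    gcc source.c -o exe       # C
    g++ source.cc -o exe      # C++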

In order to use the version of the GNU compilers (presuming that you have the tcsh shell as your default shell) in /uufs/chpc.utah.edu/sys/pkg/gcc (currently 4.7.2):
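(a sketch only -- it assumes the compilers sit in a bin subdirectory of the install; adjust to the actual layout)

    # tcsh
    set path = ( /uufs/chpc.utah.edu/sys/pkg/gcc/4.7.2_rh6/bin $path )
    # bash
    export PATH=/uufs/chpc.utah.edu/sys/pkg/gcc/4.7.2_rh6/bin:$PATH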

Intel compilers

The whole suite of the Intel compilers, debugger, etc. is located in the directory:

Before users can invoke any of these Intel compilers or the debugger they have to source the file compilervars.sh (bash) or compilervars.csh (tcsh) to set the necessary environment variables (note - this is done in the CHPC-provided .tcshrc and .bashrc files):
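For example (the tcsh line is the same one shown in the FAQ above; the bash script is assumed to live in the same directory):

    source /uufs/chpc.utah.edu/sys/pkg/intel/ics/bin/compilervars.csh intel64     # tcsh
    source /uufs/chpc.utah.edu/sys/pkg/intel/ics/bin/compilervars.sh intel64      # bash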

The C compiler is invoked by the icc command; the C++ compiler by the icpc command. If you would like to see the list of available flags, please use the man pages by typing man icc (C), man icpc (C++).

To find the compiler's version, use the flag -v, i.e. type icc -v (C), icpc -v (C++).

We generally recommend the flag ' -xAVX -O3 ' for superior performance. Very aggressive optimization can be achieved using '-xAVX -fast'. Beware that the latter option should be handled with caution.

Consult the icc/icpc man page for more details.

For more information on the C/C++ compilers, visit the following site:

PGI compilers

The latest version of the Portland Group compilers for use on kingspeak are located in the directory:

In order to use the compiler, users have to source the shell script pgi.sh (bash) / pgi.csh (tcsh) that sets paths and some other environment variables:
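For example (the install path is written as a placeholder here; the scripts sit in the etc directory of the PGI installation, as noted in the Fortran section below):

    source <pgi-install-path>/etc/pgi.csh      # tcsh
    source <pgi-install-path>/etc/pgi.sh       # bash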

The C compiler is invoked as pgcc; the C++ compiler as pgCC.

To find the compiler's version, use the flag -V, i.e. pgcc -V. For a list of all the available flags, use the man pages (e.g. man pgcc).

In order to obtain good performance we generally recommend the flag -fastsse.

For more information on the PGI C and C++ compilers and PGI Tools (Debugger and Profiler), please have a look at:

The Portland Group advises on its website the compiler options for a few well-known codes (ATLAS, FFTW, GotoBLAS,..)

Fortran

The Kingspeak cluster provides several Fortran compilers. The GNU Compiler Suite includes the Fortran compiler. The installed version on the RedHat EL 6 OS is 4.4.7 and newer versions can be found with the gcc install mentioned above.

In addition to the GNU Fortran compilers, we offer two commercial compiler suites which contain Fortran compilers: the Intel and the Portland Group Compiler (PGI) suites.

The Intel Fortran compiler generally provides superior performance.

The Portland Group Compiler Suite is another good compiler distribution.

GNU compilers

The GNU compiler gfortran is located in the default area, i.e. the compiler binary in /usr/bin, libraries in /usr/lib or /usr/lib64. The gfortran compiler supports all the options supported by the gcc compiler. In the next line we give a very basic example of how to compile an F90 source file (source.f90).
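    gfortran source.f90 -o exe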

In order to use the version of the GNU compilers in /uufs/chpc.utah.edu/sys/pkg/gcc:

Intel compilers

The whole suite of the Intel compilers, debugger, ... are located in the directory:

Before users can invoke any of these Intel compilers/debugger they have to source the file compilervars.sh (bash) or compilervars.csh (tcsh):

The Fortran compiler is invoked by the ifort command. For a list of the available flags, please use the man pages by typing man ifort.

To find the compiler's version, use the flag -v, i.e. ifort -v.

We generally recommend the flag ' -xAVX -O3 ' for superior performance. Very aggressive optimization can be achieved using '-xAVX -fast'. Beware that the latter option should be handled with caution.

Consult the ifort man page for more details.

For more information on the Fortran compiler, visit the following sites:

PGI compilers

The latest version of Portland Group compilers are located in the directory:

In order to use the compiler, users have to source the shell script pgi.sh (bash) / pgi.csh (tcsh), found in the etc directory under the path given above, which defines paths and some other environment variables:

The Fortran compilers are invoked as pgf77 (F77), pgf90 (F90), and pgf95 (F95).

To find the compiler version, use flag -V, e.g. pgf90 -V. For list of available flags, use the man pages (e.g. man pgf90).

In order to obtain good performance we generally recommend the flag -fastsse.

For more information on the Fortran compiler(s), PGI Fortran Reference and other PGI tools, please have a look at:

The Portland Group advises on its website the compiler options for a few well-known codes (ATLAS, FFTW, GotoBLAS,..)

Parallel application development

MPI - Message Passing Interface

As Kingspeak is a distributed-memory parallel system, message passing is the way to communicate between the processes of a parallel program. The Message Passing Interface (MPI) is the prevalent communication system and the preferred mode of parallel programming on Kingspeak. Our recommendation is that your first choice be MVAPICH2. You must use the version of MPI built with the same compiler you are using to build your code.

OpenMPI on Kingspeak

We have installed the OpenMPI library which is an open source MPI-2 implementation. This MPI version is installed under the directory:

Gnu:

Intel:

PGI:

The subdirectories bin, lib, and include contain executables (compilation scripts, etc), the static and dynamic libraries, and the header files, respectively.

In order to compile MPI code with OpenMPI or to run your code in parallel with OpenMPI, you first need to source the file openmpi.sh (bash) or openmpi.csh (tcsh), found in the etc folder of the directories listed above, in order to set the necessary environment variables.

Intel:

Gnu:
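For example, for the Intel build (the tcsh line is the same file shown in the FAQ above; the bash name is assumed to sit alongside it, and the GNU and PGI builds have analogous scripts under their own directories):

    source /uufs/kingspeak.peaks/sys/pkg/openmpi/std_intel/etc/openmpi.csh        # tcsh
    source /uufs/kingspeak.peaks/sys/pkg/openmpi/std_intel/etc/openmpi.sh         # bash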

Your source code can be compiled using OpenMPI on Kingspeak in the following way (dependent on the language to be used):
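(using the standard OpenMPI compiler wrappers)

    mpicc source.c -o exe          # C
    mpicxx source.cc -o exe        # C++
    mpif90 source.f90 -o exe       # Fortran 90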

To run your parallel job on Kingspeak using OpenMPI, use the default mpirun command and specify the number of processors ($PROCS) and the host file:
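(a typical invocation; inside a batch job the $PBS_NODEFILE serves as the host file)

    mpirun -np $PROCS -machinefile $PBS_NODEFILE ./exe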

For more details on these and other options to the mpirun command, see its man page or run mpirun --help.

Kingspeak's 16-core nodes have two sockets with 8 physical cores per socket, and the 20-core nodes have 10 cores per socket; both use the NUMA (non-uniform memory access) architecture. Therefore it is important to correctly place processes onto the cores and make sure they do not move around during the run. OpenMPI's mpirun command has options for process distribution and process pinning (fixing a process to a given physical processor). It is highly recommended to use pinning. For a program that spans all cores in each node, use by-core pinning and by-core process distribution, as:
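For example (the exact option names depend on the OpenMPI version; the ones below are from the 1.6 series):

    mpirun -np $PROCS -machinefile $PBS_NODEFILE -bycore -bind-to-core ./exe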

In the case of mixed OpenMP/MPI codes, we have found that it is optimal to run one MPI process per socket, that is, two MPI processes per node, with each process having 8 OpenMP threads (on the 16-core nodes). In this case we want to distribute the MPI processes by socket and bind them to the socket, as:
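For example (again assuming 1.6-series option names):

    mpirun -np $PROCS -machinefile $PBS_NODEFILE -bysocket -bind-to-socket -x OMP_NUM_THREADS=8 ./exe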

Notice that in this command line we specify OMP_NUM_THREADS as an environmental variable passed through the mpirun command to all the MPI processes.

For optimal OpenMP performance, it is also recommended to pin the OpenMP threads to processor cores. With the Intel compilers this is achieved with the KMP_AFFINITY environment variable, passed as an additional mpirun parameter:
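For example (the KMP_AFFINITY value shown is a commonly used setting, not necessarily the one the original example used):

    mpirun -np $PROCS -machinefile $PBS_NODEFILE -bysocket -bind-to-socket \
           -x OMP_NUM_THREADS=8 -x KMP_AFFINITY=granularity=fine,compact,1,0 ./exe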

For more details on the OpenMP thread affinity (which also explains the KMP_AFFINITY parameters shown above), see http://software.intel.com/sites/products/documentation/hpc/composerxe/en-us/cpp/lin/optaps/common/optaps_openmp_thread_affinity.htm

MVAPICH2 on Kingspeak

MVAPICH2 is open-source MPI software which exploits the novel features of InfiniBand. We have built MVAPICH2 on Kingspeak with all three compilers. This MPI version is installed under the directory:

Gnu:

Intel:

PGI:

The subdirectories bin, lib, and include contain executables (compilation scripts,etc), the static and dynamic libraries, and the header files, respectively.

In order to compile MPI code with MVAPICH2 or to run your code in parallel with MVAPICH2, you first need to source the file mvapich2.sh (bash) or mvapich2.csh (tcsh), found in the etc directory of the directories listed above, in order to set the necessary environment variables.

Gnu:

Intel:

PGI:

Your source code can be compiled using MVAPICH2 on Kingspeak in the following way (dependent on the language to be used):

To run your parallel job on Kingspeak with MVAPICH2, use the default mpirun command and specify the number of processors ($PROCS) and the host file in the following way:

For more details on these and other options to the mpirun command, see its man page or run mpirun --help.

Debugging and Profiling

The Data Display Debugger (ddd) is a graphical interface which supports multiple debuggers, including the standard GNU debugger, gdb. With ddd one can attach to running processes, set conditional break points, manipulate the data of executing processes, view source code, assembly code, registers, threads, and signal states. Man pages for both ddd and gdb are available. For more information visit the following URL:

http://www.gnu.org/software/ddd/

In addition the Portland Group includes a debugger, i.e. pgdbg in their PGI Compiler suite.

Totalview, a de facto industry-standard debugger, supports both serial and parallel debugging. For details on how to use Totalview, have a look at the CHPC Totalview presentation.

For serial profiling, there is GNU gprof and the Portland Group pgprof. Intel provides the VTune Amplifier XE to test the code performance for users developing serial and multi-threaded applications. Intel also provides the Intel Inspector which has been developed to find and fix memory errors and thread errors in serial or parallel code.

Libraries

Linear algebra subroutines

There are several BLAS/LAPACK library versions available on Kingspeak. We recommend using the Intel Math Kernel Library (MKL) since it is optimized for Intel processors.

We recommend the OpenBLAS libraries when using the GNU compilers. The Portland Group has the ACML (AMD Core Math Library) library within their distribution.

Intel Math Kernel Library (MKL)

MKL is best suited for Intel based processors. Thus, on Kingspeak, we recommend using MKL for BLAS and LAPACK. We also recommend using Intel Fortran and C/C++ for best performance.

MKL contains highly optimized math routines. It includes fully optimized BLAS, LAPACK, sparse solvers, a vector math library, random number generators, and fast Fourier transform routines (including FFTW wrappers). For more information, consult the Intel Math Kernel Library Documentation. Its latest version is located at:

Compilation instructions:

The examples below (diagonalization of a symmetric matrix) require the source files lapack1.f90 and lapack1.c

Intel Fortran (using dynamic linking )
Intel C/C++ (using dynamic linking)
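One possible form uses the Intel compilers' -mkl convenience flag (the original instructions may instead list the MKL libraries explicitly):

    ifort lapack1.f90 -o lapack1 -mkl=sequential       # Intel Fortran
    icc   lapack1.c   -o lapack1 -mkl=sequential       # Intel C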

If you use the C++ compiler, please replace icc by icpc and change the suffix .c into .cc in the previous statement.

It is also possible to incorporate OpenMP-threaded MKL into an OpenMP or mixed MPI/OpenMP code. To do so, parallelize your code with OpenMP but leave the MKL calls unthreaded, and instead link the threaded MKL library as e.g.:
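A sketch with the Intel compiler, where mycode.c is a placeholder and -mkl=parallel links the OpenMP-threaded MKL:

    icc -openmp mycode.c -o mycode -mkl=parallel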

Then run as you usually would with a given OMP_NUM_THREADS, and the MKL calls will run over that many threads as well.

For more MKL linking options see this document: http://software.intel.com/sites/products/documentation/hpc/mkl/lin/MKL_UG_linking_your_application/Linking_Examples.htm

OpenBLAS

OpenBLAS is an optimized BLAS implementation based on the discontinued GotoBLAS2 1.13 BSD version. Besides the complete BLAS and LAPACK libraries, the OpenBLAS library (libopenblas.a/libopenblas.so) also contains the CBLAS and LAPACKE libraries (C interfaces to the Fortran libraries).

The OpenBLAS library optimized for Kingspeak is located at:

Fortran

GNU Fortran:

C/C++

GNU C:
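For example (a sketch; <openblas-path> stands for the OpenBLAS installation directory mentioned above):

    gfortran source.f90 -o exe -L<openblas-path>/lib -lopenblas      # GNU Fortran
    gcc source.c -o exe -L<openblas-path>/lib -lopenblas             # GNU C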

ACML:

The Portland Group has included the ACML (AMD Core Math Library) library within its own compiler suite. The ACML library provides optimized BLAS and LAPACK routines.

PGI Fortran

PGI C/C++

 

 
