HiRep 0.1
Getting Started

Quick start

[asciicast: recorded terminal demo of the quick start]

Dependencies

  • A C99 compiler (GCC, clang, icc). OpenMP can be used if supported by the compiler.
  • An MPI implementation, e.g., OpenMPI or MPICH, for MPI support. Use a CUDA-aware MPI implementation for multi-GPU support.
  • CUDA 11.x and the nvcc compiler for CUDA GPU acceleration.
  • Perl 5.x for compilation.
  • ninja build for compilation.
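A quick way to check that the required tools are available is to query their versions (a minimal sketch; the compiler and MPI wrapper names may differ on your system):

gcc --version     # any C99 compiler, e.g. gcc, clang or icc
perl --version    # Perl 5.x
ninja --version   # ninja build
mpicc --version   # only needed for MPI builds
nvcc --version    # only needed for CUDA GPU builds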

Compilation

Clone the directory

git clone https://github.com/claudiopica/HiRep

Make sure the build tool Make/nj and ninja are in your PATH.
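For example, assuming HiRep was cloned into the current directory, the following sketch adds the Make/ folder to your PATH for the current shell session:

export PATH="$PATH:$(pwd)/HiRep/Make"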

Adjust compilation options

Adjust the file Make/MkFlags to set the desired options. The option file can be generated with the Make/write_mkflags.pl tool. Use

write_mkflags.pl -h

for a list of available options. The most important ones include:

Number of colors (NG)

NG=3

Gauge group SU(NG) or SO(NG)

GAUGE_GROUP = GAUGE_SUN
#GAUGE_GROUP = GAUGE_SON

Representation of fermion fields

REPR = REPR_FUNDAMENTAL
#REPR = REPR_SYMMETRIC
#REPR = REPR_ANTISYMMETRIC
#REPR = REPR_ADJOINT

Lattice boundary conditions

To select the boundary conditions for a given direction, add (or uncomment) the corresponding line in Make/MkFlags.

Available options are

  1. BC_<DIR>_PERIODIC, for periodic boundary conditions
  2. BC_<DIR>_ANTIPERIODIC, for antiperiodic boundary conditions
  3. BC_<DIR>_THETA, which associates a twisting angle with the fermion field in the specified direction <DIR>. The angle itself has to be specified in the input file.
  4. BC_<DIR>_OPEN, for open boundary conditions. Open boundary conditions can only be set in the T direction.

Example: antiperiodic boundary conditions in the time direction and periodic boundary conditions in the spatial directions:

MACRO += BC_T_ANTIPERIODIC
MACRO += BC_X_PERIODIC
MACRO += BC_Y_PERIODIC
MACRO += BC_Z_PERIODIC
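Similarly, a twisting angle for the fermion field in the X direction could be selected as follows (a sketch following the naming pattern above; the value of the angle itself is set in the input file):

MACRO += BC_T_ANTIPERIODIC
MACRO += BC_X_THETA
MACRO += BC_Y_PERIODIC
MACRO += BC_Z_PERIODIC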

Parallelization

You can select a number of features via the MACRO variable. The most important ones are:

Specify whether you want to compile with MPI by using

MACRO += WITH_MPI

For compilation with GPU acceleration on CUDA GPUs, enable GPU support together with the new geometry. If you try to compile with GPUs but forget to set the new geometry, the compilation will fail.

MACRO += WITH_GPU
MACRO += WITH_NEW_GEOMETRY

If you want to compile your code for AMD GPUs, additionally add the HIP flag:

MACRO += WITH_GPU
MACRO += WITH_NEW_GEOMETRY
MACRO += HIP

Other standard options

MACRO += UPDATE_EO

enables even-odd preconditioning, which you essentially never want to disable.

MACRO += NDEBUG

suppresses debug output. If you remove this option, HiRep will print a great deal of additional output that is rarely needed.

MACRO += CHECK_SPINOR_MATCHING

This performs a consistency check on the geometries of the spinors and is essential for debugging. In general, leaving it enabled as a safety check does not hurt, but if you simulate with very small local lattices, you may want to disable it and check whether performance improves.
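To disable it, comment out (or remove) the corresponding line in Make/MkFlags:

#MACRO += CHECK_SPINOR_MATCHING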

MACRO += IO_FLUSH

flushes output to file immediately. If the simulation or analysis prints an unusually large amount of data, this may affect performance.

Compiler options

To compile the code for your laptop, you only need to set the C compiler. For example

CC = gcc
CFLAGS = -Wall -O3
INCLUDE =
LDFLAGS =

If you want support for parallelization, you need to include the MPI compiler wrapper

CC = gcc
MPICC = mpicc
CFLAGS = -Wall -O3
GPUFLAGS =
INCLUDE =
LDFLAGS =

Another example: to use the Intel compiler with Intel's MPI implementation and no CUDA, one could use:

CC = icc
MPICC = mpiicc
LDFLAGS = -O3
INCLUDE =

With a single NVIDIA GPU and without MPI:

CC = gcc
NVCC = nvcc
CXX = g++
LDFLAGS = -Wall -O3
GPUFLAGS =
INCLUDE =

Note that this compiles a fat binary; you can also specify a target architecture via GPUFLAGS.
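For example, to target only compute capability 8.0 (NVIDIA A100) instead of building a fat binary, one could set (the architecture value is an assumption about your hardware):

GPUFLAGS = -arch=sm_80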

For a single AMD GPU, nvcc needs to be replaced by hipcc. On LUMI, the standard C and C++ compilers are cc and CC.

CC = cc
NVCC = hipcc
CXX = CC
LDFLAGS = -Wall -O3
GPUFLAGS =
INCLUDE =

For multi-GPU simulations on NVIDIA GPUs, you can set your choice of C, C++, MPI, and CUDA compilers and their options using the variables:

CC = gcc
MPICC = mpicc
NVCC = nvcc
CXX = g++
LDFLAGS = -Wall -O3
GPUFLAGS =
INCLUDE =

For multi-GPU AMD jobs on LUMI, it seems favorable to use hipcc instead of CC.

ENV = MPICH_CC=hipcc
CC = gcc
MPICC = cc
CFLAGS = -Wall -O3
NVCC = mpicc
GPUFLAGS = -w --offload-arch=gfx90a
INCLUDE =
LDFLAGS = --offload-arch=gfx90a

For more information on configuring the code for AMD GPUs, see the user guide on the GitHub pages.

Compile the code

From the root folder just type:

nj

(this is a tool in the Make/ folder: make sure it is in your PATH!) The above compiles the libhr.a library and all the available executables in the HiRep distribution, including hmc for dynamical fermion simulations, suN for pure gauge simulations, and all the applicable tests. If you wish to compile only one of the executables, e.g., suN, just change to the corresponding directory, e.g., PureGauge, and execute the nj command from there.
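For example, to build only the pure gauge executable:

cd PureGauge
nj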

All build artefacts, except the final executables, are located in the build folder at the root directory of the distribution.

Run

Adjust input file

As an example, we will use the hmc program, which can be found in the HMC directory (to create the executable, type nj in that directory). The hmc program generates lattice configurations with dynamical fermions using a hybrid Monte Carlo algorithm. It requires a number of parameters that must be specified in an input file; see HMC/input_file for an example. Input parameters are divided into sections, such as global lattice size, number of MPI processes per direction, random number generator, run control variables, and the definition of the lattice action to use for the run. The basic run control variables, for example, look like this:

run name = run1
save freq = 1
meas freq = 1
conf dir = cnfg
gauge start = random
last conf = +1

The "+" in front of last conf specifies the number of additional trajectories to be generated after the chosen startup configuration. I.e., if the startup configuration is trajectory number 5 and last conf = 6, then one additional trajectory will be generated, while if last conf = +6, then six additional trajectories will be generated (i.e., the last configuration generated will be number 11).

Execute Binary

When not using MPI, simply run:

./hmc -i input_file

where hmc is the binary generated from hmc.c. If you are using OpenMP, remember to set OMP_NUM_THREADS and other relevant environment variables to the desired values.
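For example (the thread count below is a placeholder; adjust it to your machine):

export OMP_NUM_THREADS=4
./hmc -i input_file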

For the MPI version, run

mpirun -np <number of MPI processes> ./hmc -i input_file

or follow your cluster's instructions for submitting a job script to Slurm. See the documentation for example submit scripts; a minimal sketch is also given below.
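A minimal Slurm submit script might look like the following sketch; the job geometry is a placeholder, and partition, account, and module setup are cluster-specific and omitted here:

#!/bin/bash
#SBATCH --job-name=hmc
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4
#SBATCH --time=01:00:00

srun ./hmc -i input_file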

The GPU version of the code uses 1 GPU per MPI process.

Only the MPI process with rank 0 writes the output, which by default goes to a file called out_0 in the current directory. The -o option allows you to set a different name for the output file.
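For example, to write the output to a file called run1.out instead:

./hmc -i input_file -o run1.out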

It is sometimes helpful to have output files from all MPI processes for debugging purposes. This can be enabled with the compilation option:

MACRO += LOG_ALLPIDS