hpctoolkit



Current Spack Issues for HPCToolkit

1 Introduction

Spack is a moving target and receives multiple commits per day. Normally, HPCToolkit builds and runs successfully with the latest version of all of its prerequisite packages, but occasionally it does not. This page covers the currently known issues where HPCToolkit fails to build with the latest version of Spack. The main build directions are at:

software-instructions.html

Report problems to hpctoolkit-forum at rice dot edu. Before reporting a problem, first try the versions recommended in the packages.yaml file in the spack subdirectory of the hpctoolkit repository, and always check the latest version of this page on the hpctoolkit web site:

http://hpctoolkit.org/spack-issues.html

Last revised: September 4, 2023.

2 Current Issues

2.1 (2023-09-01) Configure fails with liblzma missing -fPIC

Hpctoolkit requires the xz (lzma) library be built with +pic, but spack external find currently does not detect if liblzma.a was built with -fPIC or not. As a result, if you use an external spec for xz, spack thinks it satisfies +pic when it does not and the build fails with:

1 error found in build log:
     181    checking for lzma... /usr
     182    configure: lzma.a static archive: no
     183    configure: lzma.a compiled with -fPIC: no
  >> 184    configure: error: liblzma.a must be compiled with -fPIC

Workaround: Either remove the external spec for xz or edit the entry to add ~pic. That will prevent spack from using the external version. For example:

packages:
  xz:
    externals:
    - spec: xz@5.2.4 ~pic
      prefix: /usr

Note: In general, we recommend using spack external find only for the build utilities (cmake, perl, python) or the major MPI or GPU modules. For hpctoolkit’s dependency libraries (dyninst, elfutils, boost, etc) and their dependencies, it is best to build their latest version from source.
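
As a minimal sketch of that recommendation, external detection can be limited to named packages by passing them to spack external find. The function name find_build_tools is hypothetical, and the package list simply repeats the build utilities named in the note.

```shell
# Hypothetical helper: detect only the build utilities as externals,
# leaving xz and the other dependency libraries to be built from source.
find_build_tools() {
    spack external find cmake perl python &&
    # Print the resulting packages configuration so the entries can be reviewed.
    spack config get packages
}

# Usage, with spack already set up in the current shell:
#   find_build_tools
```

If an xz entry still shows up in the printed configuration, it came from an earlier run and can be removed or edited as described in the workaround.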

2.2 (2023-01-15) Problems with Python 3.6

Python 3.6 is now deprecated for running Spack, and support for 3.7 and earlier will be removed in the next few months.

Two problems have been reported with Python 3.6 and earlier. First, there are problems with the lustre file system that result in permission denied errors. Second, sometimes there are problems fetching files with urllib.

Solution: Both of these problems can be solved by using a later version of Python. Since support for 3.7 and earlier will be removed in a few months, the smart thing to do is to upgrade to Python 3.8 or later now. If a later version is available but not first in your path or under a different name, then one solution is to set the environment variable SPACK_PYTHON to the preferred path. For example,

export SPACK_PYTHON=/usr/bin/python3.8
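
If it is unclear which interpreter to point SPACK_PYTHON at, a short shell sketch can search for one. The function name find_spack_python and the list of candidate names are assumptions for illustration and are not part of Spack.

```shell
# Hypothetical helper: print the first interpreter on PATH that is
# Python 3.8 or later. The candidate names below are assumptions.
find_spack_python() {
    for name in python3 python3.12 python3.11 python3.10 python3.9 python3.8; do
        py=$(command -v "$name" 2>/dev/null) || continue
        # The interpreter itself reports whether it is new enough.
        if "$py" -c 'import sys; sys.exit(0 if sys.version_info >= (3, 8) else 1)' 2>/dev/null; then
            echo "$py"
            return 0
        fi
    done
    return 1
}

if SPACK_PYTHON=$(find_spack_python); then
    export SPACK_PYTHON
    echo "Using SPACK_PYTHON=$SPACK_PYTHON"
else
    unset SPACK_PYTHON
    echo "No Python 3.8 or later found on PATH" >&2
fi
```

Put these lines in the shell startup file, or source them before running spack, so the setting applies to every spack command.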

3 Recently Resolved Issues

3.1 (2022-09-23) Hpcprof-mpi build fails

The 2022.10.01 release has hpcprof-mpi disabled due to unresolved errors, but these are now fixed in the latest develop and 2023 release.

On all systems except very old Cray, you should use the +mpi option (not +cray) with a spack externals entry for the appropriate module or install prefix, or else build the MPI package if necessary.

On a reasonably new Cray, build with +mpi and always use the cray-mpich module. Try the +cray option only on a very old Cray. If there is no cray-mpich module, then build with +mpi as for a regular Linux machine.

On any Cray, always configure for the front end (os=fe), and you will likely need to add extra modules to the compiler entry. If you’re unable to find a set of modules that work, as a last resort switch to the PrgEnv-gnu module and build with --dirty.

See: software-instructions.html#Cray

3.2 (2021-07-11) HPCToolkit build fails in hpcrun-fmt.h

Very rarely, the hpctoolkit build fails in hpcrun-fmt.c or hpcrun-fmt.h with a spew of messages similar to the following.

In file included from hpcrun-fmt.c:84:
hpcrun-fmt.h:1:1: error: expected identifier or '(' before string constant
 "$Id$\n"
 ^~~~~~~~
hpcrun-fmt.c: In function 'hpcrun_fmt_hdr_fwrite':
hpcrun-fmt.c:131:10: error: 'HPCRUN_FMT_Magic' undeclared (first use in this function)
   fwrite(HPCRUN_FMT_Magic,   1, HPCRUN_FMT_MagicLen, fs);

The problem is that due to quirks in some file system timestamps, the hpcrun-fmt.h file is being overwritten with a corrupt copy.

Fixed: This problem was fixed both in the hpctoolkit repository and in the Spack recipe as of January 17, 2023, with patches for older versions.

If you see this bug in a Spack build, then update your version of Spack to something after this date. If you see this bug in an autotools build, then either update your copy of hpctoolkit, or else manually move the file src/lib/prof-lean/hpcrun-fmt.txt to a name that doesn’t end in .txt and use git to restore the original hpcrun-fmt.h file.
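
The manual repair for an autotools build can be sketched as follows, assuming the commands are run from the top of an hpctoolkit git checkout. The function name fix_hpcrun_fmt and the .disabled suffix are arbitrary choices for illustration; any new name that does not end in .txt should serve.

```shell
# Hypothetical helper: run from the top of the hpctoolkit checkout.
fix_hpcrun_fmt() {
    dir=src/lib/prof-lean
    # Rename the .txt file so it no longer ends in .txt.
    mv "$dir/hpcrun-fmt.txt" "$dir/hpcrun-fmt.txt.disabled" &&
    # Restore the committed copy of the header over the corrupt one.
    git checkout -- "$dir/hpcrun-fmt.h"
}

# Usage:
#   cd /path/to/hpctoolkit && fix_hpcrun_fmt
```

After the header is restored, rerun the build from the same build directory.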

4 General Problems

These are general problems that arise from time to time.

4.1 Unable to fetch

Sometimes spack fails to download the source file(s) for some package and dies with a message similar to this.

==> Fetching from https://ftpmirror.gnu.org/m4/m4-1.4.18.tar.gz failed.
==> Error: FetchError: All fetchers failed for m4-1.4.18-vorbvkcjfac43b7vuswsvnm6xe7w7or5

Now that most tar files are available from AWS, this problem has become very rare. When it does still happen, the problem is usually temporary and the solution is to either wait a few minutes or an hour and try again, or else download the file manually and put it into a spack mirror.
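
Placing a manually downloaded file into a mirror might look like the sketch below. It assumes the name-based mirror layout described in the Spack mirror documentation (one subdirectory per package, holding package-version.tar.gz). The function name add_to_mirror and the mirror name local-mirror are hypothetical.

```shell
# Hypothetical helper: copy a downloaded tarball into a local mirror
# directory and register that directory with spack.
add_to_mirror() {
    tarball=$1    # manually downloaded file, e.g. m4-1.4.18.tar.gz
    pkg=$2        # spack package name, e.g. m4
    mirror=$3     # absolute path of the mirror directory
    mkdir -p "$mirror/$pkg" &&
    cp "$tarball" "$mirror/$pkg/" &&
    spack mirror add local-mirror "file://$mirror"
}

# Usage:
#   add_to_mirror "$HOME/Downloads/m4-1.4.18.tar.gz" m4 "$HOME/spack-mirror"
```

The mirror only needs to be registered once. For later files, copying the tarball into the matching package subdirectory should be enough.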

Workaround: By default, Spack fetches files with internal Python libraries (urllib). Depending on the version, this sometimes fails. In that case, try switching the fetch method to the external curl program in config.yaml.

config:
  url_fetch_method: curl

4.2 Connection timeout

Another way fetch can fail is with a connection timeout. Some sites, especially sourceforge, are often slow to connect. If this happens, then increase the connection timeout in config.yaml to 30 or 60 seconds (default is 10 seconds).

config:
  connect_timeout: 60
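
The fetch method and timeout settings from sections 4.1 and 4.2 can also be applied from the command line instead of editing config.yaml by hand. This is a sketch that assumes a Spack version providing the spack config add subcommand. The function name set_fetch_options is hypothetical.

```shell
# Hypothetical helper: write both fetch-related settings into the
# default configuration scope.
set_fetch_options() {
    spack config add config:url_fetch_method:curl &&
    spack config add config:connect_timeout:60
}

# Usage:
#   set_fetch_options
#   spack config get config    # review the result
```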

4.3 New version breaks the build

Sometimes the latest version of some package breaks the build. This has happened a couple of times where a new version of Boost has broken the build for Dyninst. The solution is to revert the package to an earlier version until the rest of the code catches up.

4.4 Spack core breaks the build

Rarely, something in the spack core will change or break the code in some package.py file. The solution is to look through the spack git log and revert the repository to a recent commit before the breakage.
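
One way to step back to an earlier commit is sketched below, assuming the current directory is the spack git clone and that the approximate date of the breakage is known from the git log. The function name spack_rewind is hypothetical.

```shell
# Hypothetical helper: check out the last commit made before a given date.
# Run from inside the spack clone; the date is anything git understands.
spack_rewind() {
    date=$1
    commit=$(git rev-list -n 1 --before="$date" HEAD) || return 1
    if [ -z "$commit" ]; then
        echo "no commit found before $date" >&2
        return 1
    fi
    git checkout -q "$commit"
}

# Usage:
#   git log --oneline -20      # look for the change that broke the build
#   spack_rewind 2023-08-15
```

This leaves the clone on a detached HEAD. Checking out the original branch again (develop by default for Spack) returns to the latest version.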

5 Issues with specific versions

5.1 Binutils 2.35

Avoid binutils versions 2.35 and 2.35.1; they contain a bug that causes hpcprof to spew BFD Dwarf errors about “could not find variable specification at offset xxxx.” This is fixed in release 2.35.2 and in 2.36 or later.

5.2 Boost 1.68.0

Avoid boost version 1.68.0; it breaks the build for hpctoolkit.

5.3 Elfutils 0.176

Beginning with 0.176, elfutils requires glibc 2.16 or later and won’t work with an older glibc, such as the one on RedHat or CentOS 6.x and Blue Gene.
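
To see which side of that limit a system falls on, the glibc version can be compared against 2.16. This is a sketch that assumes getconf reports GNU_LIBC_VERSION and that sort supports -V (GNU coreutils). The function name version_ge is hypothetical.

```shell
# Hypothetical helper: succeed if version $1 is greater than or equal to $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# getconf prints a line such as "glibc 2.28"; keep only the number.
glibc=$(getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}')

if [ -z "$glibc" ]; then
    echo "could not determine the glibc version"
elif version_ge "$glibc" 2.16; then
    echo "glibc $glibc meets the 2.16 requirement of elfutils 0.176 and later"
else
    echo "glibc $glibc is older than 2.16; use an elfutils version before 0.176"
fi
```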



©2002-2023 Rice University • Rice Computer Science