stackinator's Issues

Module generation fails due to Bootstrap compiler

The module generation step fails when the version of the bootstrap compiler does not match one of the gcc versions - almost certainly a bug in Spack's module generation code.

> spack -C /user-environment/config module tcl refresh --upstream-modules --delete-tree --yes-to-all
==> Regenerating tcl module files
==> Warning: Could not write module file [/user-environment/modules/gcc/12.2.0]
==> Warning: 	--> No compilers for operating system sles15 satisfy spec [email protected] <--

In this case, generation of the module file for the gcc 12 specified in the gcc environment fails because the gcc 11 bootstrap compiler can't be found.
Manually adding the bootstrap compiler to store/config/compilers.yaml fixes the problem; however, this would expose the bootstrap compiler to downstream spack installations.

Possible solution: generate a temporary compilers.yaml that includes the bootstrap compiler, used only for the module generation step, as sketched below.
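
For illustration, the temporary file might contain an entry like the following. This is a minimal sketch: the compiler version, paths and operating system are assumptions based on the log above, not a tested configuration.

# hypothetical temporary compilers.yaml, used only for the module generation step
compilers:
- compiler:
    spec: [email protected]                       # assumed bootstrap compiler version
    operating_system: sles15
    modules: []
    paths:
      cc: /path/to/bootstrap/bin/gcc      # illustrative paths to the bootstrap install
      cxx: /path/to/bootstrap/bin/g++
      f77: /path/to/bootstrap/bin/gfortran
      fc: /path/to/bootstrap/bin/gfortran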

nvhpc issue with intel-oneapi-tbb

Desired functionality: intel-oneapi-mkl is installed in the spack stack and available for any programming environment.

First attempt: install intel-oneapi-mkl only in gcc-env.

Result: the squashfs is built, but when I build QE with nvhpc, intel-oneapi-mkl is fetched again and installed for %nvhpc.

Second attempt: add intel-oneapi-mkl to nvhpc-env

    nvhpc-env:
      compiler:
          - toolchain: llvm
            spec: [email protected]
          - toolchain: gcc
            spec: gcc@11
      unify: true
      specs:
      - libxc%nvhpc
      - [email protected]%gcc
      - intel-oneapi-mkl+cluster%nvhpc

Result: a broken installation of the dependency intel-oneapi-tbb, which is not reproducible in a standalone container with clean spack and nvhpc installations.

[+] /user-environment/linux-sles15-zen3/gcc-11.3.0/pkgconf-1.8.0-e3ts5emzovmwgyiq262u76slp62yevny
spack -e '/dev/shm/pe-dft/packages/gcc-env' install   --only-concrete --only=package --no-add /6fufv6rzstbwbvhe6hie4j3jvb5ai22d # [email protected]%[email protected]+envmods build_system=generic arch=linux-sles15-zen3
==> Installing intel-oneapi-tbb-2021.8.0-6fufv6rzstbwbvhe6hie4j3jvb5ai22d
==> No binary for intel-oneapi-tbb-2021.8.0-6fufv6rzstbwbvhe6hie4j3jvb5ai22d found: installing from source
==> Fetching https://registrationcenter-download.intel.com/akdlm/irc_nas/19143/l_tbb_oneapi_p_2021.8.0.25334_offline.sh
==> No patches needed for intel-oneapi-tbb
==> intel-oneapi-tbb: Executing phase: 'install'
==> Error: ProcessError: Command exited with status 1:
    'bash' 'l_tbb_oneapi_p_2021.8.0.25334_offline.sh' '-s' '-a' '-s' '--action' 'install' '--eula' 'accept' '--install-dir' '/user-environment/linux-sles15-zen3/gcc-11.3.0/intel-oneapi-tbb-2021.8.0-6fufv6rzstbwbvhe6hie4j3jvb5ai22d'

1 error found in build log:
     5     Wait while the installer is preparing...
     6     Done.
     7     Launching the installer...
     8     Start installation flow...
     9     Installation of component has failed.
     10    Component id: intel.oneapi.lin.oneapi-common.vars, name: oneAPI Common, version: 2023.0.0-25325.
  >> 11    Error: Sequence execution failed.
     12
     13    An error has been encountered during the installation process. Detailed installation log files are located under '/tmp/unknown_user/intel_oneapi_installer/2023.02.03.10.16.28.
           127'.
     14    Please submit this error and the log files using one of the following support options:
     15     - Report your issue on the Intel Community Forum - https://community.intel.com/t5/Intel-oneAPI-Registration/bd-p/registration-download-licensing-instal
     16     - If you have Priority Support, submit a Service request at Online Service Center - https://supporttickets.intel.com/servicecenter?lang=en-US
     17    Installer completed with code 1

See build log for details:
  /tmp/antonk/spack-stage/spack-stage-intel-oneapi-tbb-2021.8.0-6fufv6rzstbwbvhe6hie4j3jvb5ai22d/spack-build-out.txt

==> Error: intel-oneapi-tbb-2021.8.0-6fufv6rzstbwbvhe6hie4j3jvb5ai22d: Package was not installed
==> Error: Installation request failed.  Refer to reported errors for failing package(s).
make[1]: *** [gcc-env/Makefile:23: gcc-env/generated/install/intel-oneapi-tbb-2021.8.0-6fufv6rzstbwbvhe6hie4j3jvb5ai22d] Error 1
make[1]: *** Waiting for unfinished jobs....
spack -e '/dev/shm/pe-dft/packages/nvhpc-env' install   --only-concrete --only=package --no-add /lmdnphjb3zjaouhaxy2zy6nhmtkmrjnm # [email protected]%[email protected]+envmods build_system=generic arch=linux-sles15-zen3

Third attempt: add - intel-oneapi-mkl+cluster%gcc to nvhpc-env

Result:

==> Error: Name clashes detected in module files:

file: /user-environment/modules/intel-oneapi-mkl/2023.0.0-gcc
spec: [email protected]%[email protected]+cluster+envmods~ilp64+shared build_system=generic arch=linux-sles15-zen3
spec: [email protected]%[email protected]+cluster+envmods~ilp64+shared build_system=generic arch=linux-sles15-zen3

The projection in modules.yaml has intel-oneapi-mkl: '{name}/{version}-{compiler.name}'. Removing the projection leads to the same error, which is strange as it works for cuda.

Latest update: it was possible to build QE with the stack, but intel-oneapi-mkl was installed for the nvhpc environment in the user's own space.

Add schema for YAML

Add schema for the config.yaml, packages.yaml and compilers.yaml files in recipes.

We currently do little to check or validate the input in these files; a schema looks like the best line of defence for validating input and giving early feedback to users.
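
A minimal sketch of what schema-based validation could look like, assuming the recipe files are already loaded with PyYAML; the schema fields below are illustrative rather than the real recipe format.

import jsonschema
import yaml

# illustrative schema fragment for config.yaml; the real schema would enumerate
# the actual recipe fields and their types
config_schema = {
    "type": "object",
    "additionalProperties": False,
    "required": ["name", "system"],
    "properties": {
        "name":    {"type": "string"},
        "system":  {"type": "string"},
        "spack":   {"type": "object"},
        "modules": {"type": "boolean"},
    },
}

def validate_config(path):
    with open(path) as fid:
        raw = yaml.safe_load(fid)
    # raises jsonschema.exceptions.ValidationError with a pointer to the bad field
    jsonschema.validate(instance=raw, schema=config_schema)

The error raised by jsonschema carries the path of the offending field, which is exactly the kind of early feedback described above.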

Exclude large binary packages from build cache

Pushing large binary packages, such as nvhpc, cuda and possibly the Intel compilers/MKL, takes a long time.
This is due to the cost of scanning the large package, and the slow single-threaded compression algorithm used by spack.

The time taken negates the benefits of using build caches, as these packages do not require compilation.

New version of spack seems to break the tool

I'm getting this error

spack -e ./gcc buildcache create --rebuild-index --allow-root --only=package -m alpscache \
$(spack --color=never -e ./gcc find --format '{name};{/hash}' | grep -v -E '^();' | cut -d ';' -f2)
==> Error: unrecognized arguments: -m

when trying to build eth-cscs/alps-uenv#10

[dft-dev-pe] Error building the stack

First attempt, with the following compilers.yaml

compilers:
  bootstrap:
    specs:
    - gcc@11
  gcc:
    specs:
    - gcc@11
  llvm:
    requires: gcc@11
    specs:
    - [email protected]
    - llvm@14

and packages.yaml

packages:
    gcc-env:
      compiler:
          toolchain: gcc
          spec: gcc@11
      unify: true
      specs:
      - [email protected]
      - py-mpi4py
      - [email protected]
      - py-numpy
      - py-pybind11
      - py-pip
      - cmake
      - intel-oneapi-mkl
      mpi:
          spec: cray-mpich-binary
          gpu: Null
    nvhpc-env:
      compiler:
          toolchain: gcc
          spec: [email protected]
      unify: true
      specs:
      - libxc
      mpi:
          spec: cray-mpich-binary
          gpu: Null

fails with the error

spack -e nvhpc-env/ concretize -f
==> Error: Detected 1 missing include path(s):
/scratch/e1000/antonk/dft-dev-pe/packages/nvhpc-env/compilers.yaml
Makefile:35: gcc-env/Makefile: No such file or directory
make[1]: *** [../Make.inc:20: nvhpc-env/spack.lock] Error 1
make[1]: *** Waiting for unfinished jobs....

Easy access to the build env

Debugging the build can be challenging, because recreating the build environment (with environment variables erased+set and the various bwrap mounts applied) has to be performed manually in an ad-hoc manner.

Create a script that can be sourced to start a shell with the correct environment, e.g.

source build-env.sh
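
A rough sketch of what such a script might wrap, based on the bwrap invocation used by the generated Makefiles elsewhere in this tracker; the build path and mounts are illustrative and would need to match the actual build.

# hypothetical build-env.sh, generated into the build path
build=/dev/shm/$USER/build                 # illustrative build path
$build/bwrap-mutable-root.sh \
    --tmpfs ~ \
    --bind $build/tmp /tmp \
    --bind $build/store /user-environment \
    bash --norc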

Feature: add CLI option to override target cluster

Most recipes can be built on any target system with the appropriate hardware; however, the target system is currently fixed in config.yaml.

Add a --cluster flag to stack-config that allows the target system to be specified at configure time.

stack-config -r $recipe_path -b $build_path --cluster=clariden
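
A minimal sketch of how the flag might be wired into stack-config's argument parser; only --cluster is new, the other options are taken from the example above, and the parser structure itself is an assumption.

import argparse

parser = argparse.ArgumentParser(prog="stack-config")
parser.add_argument("-r", "--recipe", required=True, help="path to the recipe")
parser.add_argument("-b", "--build", required=True, help="path to the build directory")
parser.add_argument("--cluster", default=None,
                    help="override the target cluster named in the recipe's config.yaml")
args = parser.parse_args()

# if set, args.cluster takes precedence over the cluster defined in config.yaml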

Utopia stack with Trilinos+CUDA: cudaErrorUnsupportedPtxVersion

I have built a software stack for Utopia on Clariden using this recipe:
https://github.com/edopao/utopia-recipe/blob/ede5c35792e12c4e8a0c46918846dbc543e5665d/environments.yaml

This recipe enables the CUDA variant on all packages, with cuda_arch=80. Here is the concretisation result for Trilinos:

==> Concretized [email protected]
 -   3xr57ku  [email protected]%[email protected]~allow-unsupported-compilers~dev build_system=generic arch=linux-sles15-zen3

==> Concretized [email protected]+amesos2+belos~epetra+intrepid2+mumps+nox+openmp+shards+suite-sparse+superlu-dist cxxstd=17
 -   o23zzjq  [email protected]%[email protected]~adelus~adios2~amesos+amesos2+anasazi~aztec~basker+belos~boost~chaco~complex+cuda~cuda_rdc~debug~dtk~epetra~epetraext~epetraextbtf~epetraextexperimental~epetraextgraphreorderings~exodus+explicit_template_instantiation~float+fortran~gtest~hdf5~hypre~ifpack+ifpack2~intrepid+intrepid2~ipo~isorropia+kokkos~mesquite~minitensor~ml+mpi+muelu+mumps+nox+openmp~panzer~phalanx~piro~python~rocm~rocm_rdc~rol~rythmos+sacado~scorec+shards+shared~shylu~stk~stokhos~stratimikos~strumpack+suite-sparse~superlu+superlu-dist~teko~tempus~thyra+tpetra~trilinoscouplings~uvm+wrapper~x11~zoltan~zoltan2 build_system=cmake build_type=RelWithDebInfo cuda_arch=80 cxxstd=17 gotype=long_long arch=linux-sles15-zen3

After building Utopia in the above user environment, I get a CUDA runtime error:

terminate called after throwing an instance of 'std::runtime_error'
  what():  cudaDeviceSynchronize() error( cudaErrorUnsupportedPtxVersion): the provided PTX was compiled with an unsupported toolchain. /tmp/epaone/spack-stage/spack-stage-trilinos-13.4.0-o23zzjqfcj6fo55x4rqqvihjdklmo6dv/spack-src/packages/kokkos/core/src/Cuda/Kokkos_Cuda_Instance.cpp:151

Here is the output of ldd command for reference:

$ ldd utopia_test | grep cuda
	libcudart.so.11.0 => /user-environment/linux-sles15-zen3/gcc-11.3.0/cuda-11.8.0-3xr57kuw4q4cw53rscdnvqyjorpqamnp/lib64/libcudart.so.11.0 (0x00007fc4a0630000)
	libnvToolsExt.so.1 => /user-environment/linux-sles15-zen3/gcc-11.3.0/cuda-11.8.0-3xr57kuw4q4cw53rscdnvqyjorpqamnp/lib64/libnvToolsExt.so.1 (0x00007fc4a0426000)
	libcufft.so.10 => /user-environment/linux-sles15-zen3/gcc-11.3.0/cuda-11.8.0-3xr57kuw4q4cw53rscdnvqyjorpqamnp/lib64/libcufft.so.10 (0x00007fc48f54b000)
	libcublas.so.11 => /user-environment/linux-sles15-zen3/gcc-11.3.0/cuda-11.8.0-3xr57kuw4q4cw53rscdnvqyjorpqamnp/lib64/libcublas.so.11 (0x00007fc4898ed000)
	libcusparse.so.11 => /user-environment/linux-sles15-zen3/gcc-11.3.0/cuda-11.8.0-3xr57kuw4q4cw53rscdnvqyjorpqamnp/lib64/libcusparse.so.11 (0x00007fc478bf5000)
	libcusolver.so.11 => /user-environment/linux-sles15-zen3/gcc-11.3.0/cuda-11.8.0-3xr57kuw4q4cw53rscdnvqyjorpqamnp/lib64/libcusolver.so.11 (0x00007fc46693d000)
	libcurand.so.10 => /user-environment/linux-sles15-zen3/gcc-11.3.0/cuda-11.8.0-3xr57kuw4q4cw53rscdnvqyjorpqamnp/lib64/libcurand.so.10 (0x00007fc460061000)
	libcuda.so.1 => /usr/lib64/libcuda.so.1 (0x00007fc45e831000)
	libmpi_gtl_cuda.so.0 => /user-environment/linux-sles15-zen3/gcc-11.3.0/cray-mpich-8.1.24-gcc-fwf2cccra3y3lxkzw7kvqjyvwfipin4i/lib/libmpi_gtl_cuda.so.0 (0x00007fc45b7ba000)
	libcublasLt.so.11 => /user-environment/linux-sles15-zen3/gcc-11.3.0/cuda-11.8.0-3xr57kuw4q4cw53rscdnvqyjorpqamnp/lib64/libcublasLt.so.11 (0x00007fc411be9000)

cray-mpich +cuda fails to build applications for Spack v0.19.0

Building MPI packages with cray-mpich-binary+cuda fails with Spack v0.19.0 and later.

The mpicc compiler wrapper (specifically for the C language) fails during the linking stage with errors like the following:

/usr/bin/ld: /user-environment/linux-sles15-zen2/gcc-11.3.0/cray-mpich-binary-8.1.21.1-gcc-4k7oxj3rl75ztwko5s3lamdttacaue4p/lib/libmpi_gtl_cuda.so: undefined reference to `__gxx_personality_v0'

__gxx_personality_v0 is defined in libstdc++.

Two manual workarounds have been shown to fix the issue:

  1. Patching the mpicc wrapper to add -lstdc++
  2. Patching libmpi_gtl_cuda.so to declare its missing dependency: patchelf --add-needed libstdc++.so libmpi_gtl_cuda.so

Of these, the second is more robust, and can be implemented in cray-mpich-binary/package.py.
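
A minimal sketch of how that could look in cray-mpich-binary/package.py, assuming Spack's run_after hook and a patchelf available in the build environment; the class name and method are illustrative, not the actual package code.

from spack.package import *

class CrayMpichBinary(Package):
    # ... existing package definition ...

    @run_after("install")
    def add_libstdcxx_dependency(self):
        # declare libstdc++ as an explicit dependency of the GTL library, so
        # that downstream links no longer miss __gxx_personality_v0
        patchelf = which("patchelf", required=True)
        gtl = join_path(self.prefix.lib, "libmpi_gtl_cuda.so")
        patchelf("--add-needed", "libstdc++.so", gtl)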

Use name `cray-mpich` instead of `cray-mpich-binary`

Some packages and compiler toolchains explicitly check the name of the MPI implementation, e.g. openmpi, mvapich2 and so on.
The cray-mpich-binary name can cause spack to generate errors in these tests.

Investigate whether the cray-mpich-binary package can be renamed cray-mpich, which would replace the cray-mpich provided by spack.

Better log file management

Unique log files are generated in the path where sstool is called.
This gets messy quickly.
Decide on

  • a standard location for log files
  • a log file retention policy

Python version and jsonschema

jsonschema has been added as a dependency in #31. When I try to use the latest version of sstool on hohgant I get the following error:

Traceback (most recent call last):
  File "/users/rmeli/git/sstool/bin/sstool", line 12, in <module>
    from sstool.main import main
  File "/users/rmeli/git/sstool/sstool/main.py", line 15, in <module>
    import sstool.schema
  File "/users/rmeli/git/sstool/sstool/schema.py", line 2, in <module>
    import jsonschema
  File "/users/rmeli/git/sstool/external/jsonschema/__init__.py", line 13, in <module>
    from jsonschema._format import FormatChecker
  File "/users/rmeli/git/sstool/external/jsonschema/_format.py", line 1
    from __future__ import annotations
    ^
SyntaxError: future feature annotations is not defined

sstool runs with
https://github.com/eth-cscs/sstool/blob/7c49c8411bf23e06810d4926025a4bb9f7dd898f/bin/sstool#L1

which on hohgant corresponds to Python 3.6.15 (default, Sep 23 2021, 15:41:43) [GCC] on linux. However, the annotations future feature was only introduced in Python 3.7.0b1.


Steps to reproduce on hohgant:

$ /usr/bin/python3
Python 3.6.15 (default, Sep 23 2021, 15:41:43) [GCC] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from __future__ import annotations
  File "<stdin>", line 1
SyntaxError: future feature annotations is not defined
>>>

Disambiguate MPI specification

MPI is specified as follows in packages.yaml:

packages:
    env-name:
      mpi: cray-mpich-binary
      gpu: cuda

This is ambiguous, as it is not clear whether the gpu field applies to MPI, or all packages in the environment.

Proposal: replace with a YAML "object"

packages:
  env-name:
    mpi:
      spec: cray-mpich-binary
      gpu: cuda

This can be extended with other MPI-specific options in the future, if needed.

Ignore build cache during concretization

By default spack looks for compatible packages from a build cache, and will use them when available.

For reproducibility, spack should ignore the build cache during concretization, and only use build cache packages when the chosen spec is available in the cache.

The desired behavior can be enabled with the following setting:

   concretizer:
     reuse: false # do not allow the concretizer to behave differently depending on the artifacts available in the cache

Update doc about cache-config.yaml

https://eth-cscs.github.io/stackinator/build-caches/#managing-keys says:

The cache-configuration would look like the following, where we assume that the cache is in $SCRATCH/uenv-cache:

root: $SCRATCH/uenv-cache
key: $HOME/.keys/spack-push-key.gpg

but running env [...] make store.squashfs -j32:

/dev/shm/jg/stackinator.git/build/bwrap-mutable-root.sh \
  --tmpfs ~ \
  --bind build/tmp /tmp \
  --bind build/store /user-environment \
  spack gpg trust /users/piccinal/.keys/spack-push-key.gpg

will fail with:

==> Error: Command exited with status 2:
'/usr/bin/gpg2' '--with-colons' '/users/piccinal/.keys/spack-push-key.gpg'
gpg: can't open '/users/piccinal/.keys/spack-push-key.gpg'

A solution is to move the gpg key out of $HOME and update cache-config.yaml accordingly.
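
For example, the cache configuration could keep the key next to the cache on scratch; the paths are illustrative, only the root/key layout is taken from the documentation quoted above.

root: $SCRATCH/uenv-cache
key: $SCRATCH/keys/spack-push-key.gpg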

Improved meta-data in the image

Include more metadata in the image about how the stack was built:

  • the recipe
  • information about the node (uname -a, CPU arch, GPU arch, cuda driver, etc)
  • stackinator: version, arguments and log file
  • the target mount point

module file generation peccadillos

Some feedback from @msimberg:

I don't use the modules generated by spack so I thought I would just delete the modules.yaml file in the template directory. This ends up using the default modules.yaml which has

    roots:
      lmod: $spack/share/spack/lmod
      tcl: /user-environment/modules

which means that if one tries to install things on top of the user environment, spack will try to write into /user-environment/modules (which fails because it's read-only). Then I tried overriding modules.yaml (in the recipe directory) with:

modules:
  default:
    enable: []

but that silently went on to generate some makefiles and build a lot of stuff; I only noticed at the end that it didn't actually build everything. Finally, I realized that the reason was this line:
https://github.com/bcumming/sstool/blob/2d0f980838ad22ac1a83bd1bc35ba48643401919/lib/sstool/main.py#L73.
sstool generated a partial set of makefiles, enough to build packages, but not enough to finish.
TL;DR: can this be made to handle the nonexistence of that key more gracefully:
https://github.com/bcumming/sstool/blob/2d0f980838ad22ac1a83bd1bc35ba48643401919/lib/sstool/main.py#L73

In the end I settled on the following modules.yaml:

modules:
  default:
    enable: []
    # This key has to exist...
    roots:
      tcl: /user-environment/modules

external packages in packages.yaml recipe are not added to the generated spack.yaml

I am trying to create a recipe for the AMD stack on Hohgant. A working version of the generated spack.yaml would be:

spack:                                                                               
  include:
  - compilers.yaml
  - config.yaml
  view: false
  concretizer:
    unify: true
    reuse: false
  specs:
  - kokkos+rocm std=17 amdgpu_target=gfx90a ^[email protected] ^[email protected]
  - kokkos-kernels
  - cray-mpich-binary+rocm ^[email protected]
  - [email protected] ^[email protected]
  - fftw +mpi
  - hdf5 +mpi
  - openblas
  - boost
  packages:
    all:
      variants: std=17 amdgpu_target=gfx90a amdgpu_target_sram_ecc=gfx90a target=zen3
      compiler: [gcc@11]
    mpi:
      require: cray-mpich-binary
    hip:
      buildable: false
      externals:
      - spec: [email protected]
        prefix: /opt/rocm
    rocm-cmake:
      buildable: false
      externals:
      - spec: [email protected]
        prefix: /opt/rocm/
    rocminfo:
      buildable: false                                                               
      externals:                                                                     
      - spec: [email protected]                                                         
        prefix: /opt/rocm/                                                           
    rocprim:                                                                         
      buildable: false                                                               
      externals:                                                                     
      - spec: [email protected]                                                          
        prefix: /opt/rocm/rocprim                                                    
    llvm-amdgpu:                                                                     
      buildable: false                                                               
      externals:                                                                     
      - spec: [email protected]                                                      
        prefix: /opt/rocm                                                            
    hsa-rocr-dev:                                                                    
      buildable: false                                                               
      externals:                                                                     
      - spec: [email protected]                                                     
        prefix: /opt/rocm                                                            

By manually modifying the generated packages/gcc-env/spack.yaml file to match the above, I am able to build a working AMD stack and create the squashfs file. The rest follows the recipe in test/base-amdgpu.

However, I was not able to modify the recipe so that it generates the content described above. This configuration uses external packages, and I tried placing them in different positions in the recipe's packages.yaml, but they were always discarded. The file that I ended up creating is the following:

packages:
    gcc-env:
      compiler:
          toolchain: gcc
          spec: gcc@11
      unify: true
      specs:
      - kokkos+rocm std=17 amdgpu_target=gfx90a ^[email protected] ^[email protected]
      - kokkos-kernels
      - cray-mpich-binary+rocm ^[email protected]
      - fftw +mpi
      - hdf5 +mpi
      - openblas
      - boost
      mpi:
        spec: cray-mpich-binary
        gpu: rocm
      hip:
        buildable: false
        externals:
        - spec: [email protected]
          prefix: /opt/rocm
      rocm-cmake:
        buildable: false
        externals:
        - spec: [email protected]
          prefix: /opt/rocm/
      rocminfo:
        buildable: false
        externals:
        - spec: [email protected]
          prefix: /opt/rocm/
      rocprim:
        buildable: false
        externals:                                                           
        - spec: [email protected]                                                
          prefix: /opt/rocm/rocprim                                          
      llvm-amdgpu:                                                           
        buildable: false                                                     
        externals:                                                           
        - spec: [email protected]                                            
          prefix: /opt/rocm                                                  
      hsa-rocr-dev:                                                          
        buildable: false                                                     
        externals:                                                           
        - spec: [email protected]                                           
          prefix: /opt/rocm                                                  
    tools:                                                                   
      compiler:                                                              
          toolchain: gcc                                                     
          spec: gcc@11                                                       
      unify: true                                                            
      specs:                                                                 
      - cmake                                                                
      - [email protected]                                                          
      - py-numpy                                                             

It discarded all the HIP-related external packages, together with the +rocm specification on the cray-mpich-binary package, even though gpu: rocm was set. The same happens with gpu: cuda.

What is the correct place to add those external packages so that they are generated correctly?

Support for external packages

A proposal to make the following changes:

  • rename the packages.yaml file to environments.yaml
  • add an optional packages.yaml, in the standard Spack format, that is appended to the cluster-specific packages.yaml, so that a stack can extend the set of system packages that are used and make them available to downstream spack users (see the sketch below).

This will reduce the ambiguity of the current naming scheme, and make it easier for recipe writers to control the use of system packages and ensure that they are visible to downstream consumers.
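
As a rough sketch, the optional packages.yaml would use the standard Spack format, for example to declare a system-provided external; the hip entry below is lifted from the AMD recipe issue above and the version is illustrative.

# hypothetical recipe-level packages.yaml, appended to the cluster-specific one
packages:
  hip:
    buildable: false
    externals:
    - spec: [email protected]
      prefix: /opt/rocm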

per-environment variants

Add support for setting a variant that will be applied to every package in an environment, e.g. to set the cuda arch or +mpi.

# example environments.yaml
my-gpu-mpi-env:
  compiler:
  - toolchain: gcc
    spec: gcc@11 
  specs:
  - hdf5
  - osu-micro-benchmarks +cuda
  mpi:
    spec: cray-mpich-binary
    gpu: None
  unify: true
  variants:
    - +mpi
    - cuda_arch=80

The generated spack.yaml for the environment would contain:

  packages:
    all:
      variants:
      - cuda_arch=80
      - +mpi

The result would be hdf5 being built with mpi support and the osu benchmarks with support for A100.

`compilers.yaml` not generated when gcc version is shadowed by system gcc

When the version of gcc in the gcc compiler toolchain matches the system-provided compiler, the compiler is not found when generating compilers.yaml for any spack environment that requests the gcc compiler toolchain.

This is observed on a system with gcc 11.3 in /usr/bin, when the [email protected] toolchain is requested to install nvhpc or an environment in environments.yaml.

It looks like the version found in the compilers/gcc environment matches the one in config/compilers.yaml, so it is skipped for some reason. This is either a Spack bug, or a misuse of Spack on our part.

Don't allow build paths in `~`

The build tool rebinds ~ as a tmpfs in order to hide any user-specific Spack configuration that would be present in ~/.spack.

This causes the build step to fail if the build is being performed in a sub-directory of $HOME.

The solution is to check that the path provided with -b is not a child of ~.
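
A minimal sketch of such a check, assuming the build path is validated in Python before any mounts are set up; the function name and error message are illustrative.

from pathlib import Path

def validate_build_path(build_path: str) -> Path:
    # resolve symlinks so that e.g. ~/build and /users/me/build compare equal
    home = Path.home().resolve()
    build = Path(build_path).expanduser().resolve()
    if build == home or home in build.parents:
        raise ValueError(
            f"build path {build} must not be inside {home}: "
            "the home directory is hidden by a tmpfs during the build")
    return build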

Add file system views

Add support for file system views, as an alternative interface to modules for users.

Add configure-cache tool to stackinator

Add a command line tool for configuring caches in recipes, based on the tool used in the alps spack stacks:

usage: configure-cache [-h] [-k KEY] [-p PATH] [-d] [--read-only] recipe

Configure a stackinator recipe to enable/disable a build cache.

positional arguments:
  recipe                the path of the recipe to configure

optional arguments:
  -h, --help            show this help message and exit
  -k KEY, --key KEY     path to the gpg key used to sign packages - required
                        to update the cache.
  -p PATH, --path PATH  path to the build cache - if provided without a key
                        the cache will be read only.
  -d, --disable         disable the build cache
  --read-only
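
Based on the help text above, a typical invocation to enable a writable cache for a recipe might look like (paths are illustrative):

configure-cache -p $SCRATCH/uenv-cache -k $HOME/.keys/spack-push-key.gpg $recipe_path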

To consider:

  • extending the stack-config command line to accept cache configuration options
  • supporting a JSON file format for specifying cache configurations
  • support for creating and pushing to a sub-directory in the cache for each mount point

Question: manipulating spack mirrors and other defaults?

Greetings.

We need to always override the spack clone's default mirrors.yaml and bootstrap/spack-install/metadata.yaml to use internal resources.

I've been experimenting with: (a) managing my own spack clone, for example checking out v0.20.0, editing the corresponding files, and committing the changes to a branch, e.g. internal-v0.20.0; (b) bind-mounting the corresponding files over {{build_path}}'s spack in our sandbox; and (c) bind-mounting files in /etc/spack in the sandbox. All of these feel overly complex, and I do not appear to be able to manipulate these defaults with the recipe/system-config templates.

Am I missing a simpler solution for this?

buildcache create fails with Spack develop

Building a stack with Spack develop fails with the following error:

spack -e ./bootstrap buildcache create --rebuild-index --allow-root --only=package -m alpscache $(spack -e ./bootstrap find --format '/${hash}')
==> Warning: Using flags to specify mirrors is deprecated and will be removed in Spack 0.21, use positional arguments instead.
==> Error: unexpected tokens in the spec string
/$qopnyywo6pbzxnuwmhigd4atak36224e /$o4njaxpd2vbtg3xj6pm7riastfp4gdno /$zh6jx2cafu2oik2thrnfvzenymuwkqbf /$vtsmei3mpeijbilx6nrvnlhdhb6xlara /$qgixj7zzmq7zwtexq2nvhwwzgp5qptwn /$r5fc5voqdghbdr2dwrctqx7l2dxrkz2r /$nckz4pa6jigjgieo4s4yillr5a5pj56m /$6cbkl5kgjgoua7lkwm2573iar7zgmnvb /$x2f7nteaybed5tahr7da3sqnf4lmfy65 /$43334xnuikvddwtq4hwjyy4s2jhi2fxc /$j66ksktzvekpmnyetvwbu2xdovwntdmq /$dflv7d5jedfoiyry6k6i2idav6nqpbas /$lysz7xccwfwiuyh74aw4q43q5bgiv5vn /$y4yleakj6ts7eftfvyxknivvs4ycgris /$eoqugy3pfg53qncmi45fhm4lmcisds7y /$ed674flkfaa37bfvi2kygwthjdlxo7kh /$ed2as5owju3xj5ywvutc4oqskohqndzo /$hmssjulvwz3cnex4ig3xxqioz5yqn63s
^^^^ ^                                   ^^   ^^^^ ^                                   ^^   ^^^^ ^                                   ^^   ^^^^ ^                                   ^^   ^^^^ ^                                   ^^   ^^^^ ^                                   ^^   ^^^^ ^                                   ^^   ^^^^ ^                                   ^^   ^^^^ ^                                   ^^   ^^^^ ^                                   ^^   ^^^^ ^                                   ^^   ^^^^ ^                                   ^^   ^^^^ ^                                   ^^   ^^^^ ^                                   ^^   ^^^^ ^                                   ^^   ^^^^ ^                                   ^^   ^^^^ ^                                   ^^   ^^^^ ^                                   ^^
make[1]: *** [Makefile:16: bootstrap/generated/build_cache] Error 3
make[1]: *** Waiting for unfinished jobs....

I tracked the issue to spack/spack#37425.


During the bisection, the following error also crept up:

/dev/shm/rmeli/test/bwrap-mutable-root.sh --tmpfs ~ --bind /dev/shm/rmeli/test/tmp /tmp --bind /dev/shm/rmeli/test/store /user-environment spack -C /dev/shm/rmeli/test/modules module tcl refresh --upstream-modules --delete-tree --yes-to-all
==> Error: Cannot use invalid module set default.    Valid module set names are []
make: *** [Makefile:50: modules] Error 1

PyPI package

Create a PyPI package.

  • add package configuration to the repo
  • push a test version to PyPI
  • add a GitHub Actions workflow to automatically push to PyPI

[Note] Building GCC from Spack's develop fails

Using the latest version of Spack's develop now fails to build GCC 11 with the following error:

  >> 4652    lto-compress.c:(.text+0x173): undefined reference to `ZSTD_compressBound'
  >> 4653    /usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: lto-compress.c:(.text+0x194): undefined reference to `ZSTD_maxCLevel'
  >> 4654    /usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: lto-compress.c:(.text+0x1b0): undefined reference to `ZSTD_compress'
  >> 4655    /usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: lto-compress.c:(.text+0x1bb): undefined reference to `ZSTD_isError'
  >> 4656    /usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: lto-compress.c:(.text+0x231): undefined reference to `ZSTD_maxCLevel'
  >> 4657    /usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: lto-compress.c:(.text+0x240): undefined reference to `ZSTD_getErrorName'
     4658    /usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: lto-compress.o: in function `lto_end_uncompression(lto_compression_stream*, lto_compression)':
  >> 4659    lto-compress.c:(.text+0x4ff): undefined reference to `ZSTD_getFrameContentSize'
  >> 4660    /usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: lto-compress.c:(.text+0x532): undefined reference to `ZSTD_decompress'
  >> 4661    /usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: lto-compress.c:(.text+0x53d): undefined reference to `ZSTD_isError'
  >> 4662    /usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: lto-compress.c:(.text+0x5a0): undefined reference to `ZSTD_getErrorName'
  >> 4663    collect2: error: ld returned 1 exit status
  >> 4664    make[4]: *** [/tmp/rmeli/spack-stage/spack-stage-gcc-11.3.0-6lzw62v4qr6t3mlie3yeqa4l5dsye6lj/spack-src/gcc/cp/Make-lang.in:136: cc1plus] Error 1

This is likely related to the latest version of zstd added in spack/spack#35438.

Note

  • This is observed when building the bootstrap compiler.

Improved error messages

  1. Fix the logger to print clearer error messages in general:
     • a clear, concise error message to the console
     • the stack trace to the log file
  2. Add an additional layer of error handling for YAML parsing to provide short messages that show the incorrect input field

always generate modules when a `modules.yaml` file is provided in a recipe

The current workflow for building a stack with module files requires calling make twice

# build the compilers, packages, configuration and finally the module files in store
make modules
# compress the contents of store
make store.squashfs

If make store.squashfs is called directly, the module stage is skipped.

If a recipe explicitly provides a modules.yaml file, the modules should be built by default.

Error "Unrecognized xattr prefix lustre.lov" during squashfs image creation process

This message appears thousands of times during the squashfs image creation process. It is not fatal and looks like a warning, since the resulting image works fine.

I am not sure what information would be useful to report, but it should be easy to reproduce on my side. This happened while creating an image that uses the ROCm library provided by the system in /opt/rocm-5.2.4.

@boeschf also encountered the same thing with a similar recipe.

`cray-mpich-binary` currently needs to be added manually as a package to spack instance depending on spack stack as upstream

Currently, trying to use cray-mpich-binary in a spec in a spack instance that uses the spack stack as an upstream will fail without manual changes: while cray-mpich-binary is installed, its package definition is not available in the upstream.

This can currently be worked around by manually copying the cray-mpich-binary package definition to the dependent spack instance.

I think this would be most cleanly fixed by creating a custom spack repository somewhere under /user-environment and adding the following to /user-environment/config/repos.yaml:

repos:
- /user-environment/path/to/cray-mpich-binary/repo

The package definition should then automatically be picked up if SPACK_SYSTEM_CONFIG_PATH is set to point to /user-environment/config.
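
For reference, a downstream spack instance would then pick up both the configuration and the extra repository with something like the following (a sketch, assuming the stack is mounted at /user-environment):

export SPACK_SYSTEM_CONFIG_PATH=/user-environment/config
spack info cray-mpich-binary   # package definition now resolved from the extra repo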
