Licensing

Design Point and Parameter Point subtask timeout when using SLURM

When updating Design Points or Parameter Points on a Linux system running a SLURM scheduler, the RSM log file shows the following warnings and errors:

DPs 5 – SubTask – srun: Job 3597 step creation temporarily disabled, retrying (Requested nodes are busy)
[WARN] RSM subtask for DP 4 has not started as 5 minutes have passed

    • Solution

      Because of a change in SLURM version 20.11, SLURM systems now by default allow only one srun process to be active on each compute node. As a result, RSM subtasks can time out if the solution phase of a calculation takes longer than 5 minutes to complete. The workaround is to add the --overlap argument to the SLURM srun command. To do this, edit the file {installdir}/v212/RSM/Config/xml/hpc_commands_SLURM.xml and replace the lines containing srun --ntasks=1 with srun --overlap --ntasks=1.

      Attachments:
      1. 2066933.zip
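
The edit described in the solution can also be scripted. The following is a minimal sketch, not an official Ansys tool. It assumes that the file contains the literal text srun --ntasks=1 (with a double hyphen) and that the file can be read as UTF-8. The function names add_overlap and patch_file, and the .bak backup suffix, are hypothetical choices made for this illustration.

```python
import shutil
from pathlib import Path
from typing import Tuple

# Strings taken from the solution text; the exact spacing inside the
# real hpc_commands_SLURM.xml file is an assumption.
OLD = "srun --ntasks=1"
NEW = "srun --overlap --ntasks=1"


def add_overlap(text: str) -> Tuple[str, int]:
    """Return the patched text and the number of replacements made."""
    count = text.count(OLD)
    return text.replace(OLD, NEW), count


def patch_file(path: Path) -> int:
    """Patch the file in place, keeping a .bak copy of the original.

    Returns the number of replacements. Nothing is written when the
    file has no match, so running the function twice is harmless.
    """
    original = path.read_text(encoding="utf-8")  # encoding is an assumption
    patched, count = add_overlap(original)
    if count:
        backup = path.with_name(path.name + ".bak")
        shutil.copy2(path, backup)  # keep the unmodified file
        path.write_text(patched, encoding="utf-8")
    return count
```

To use it, call patch_file(Path("{installdir}/v212/RSM/Config/xml/hpc_commands_SLURM.xml")) with {installdir} replaced by the actual installation directory. A return value of 0 means that the expected text was not found, in which case the file should be inspected and edited by hand.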