November 20, 2021 at 1:04 pm, KartiSinghFreeman (Subscriber):
Hey,
I'm currently submitting jobs with PBS Pro on the university HPC, but I'm struggling to get multi-node simulations to use more than one core. I've tried submitting through Workbench and through the Mechanical solver directly, but neither appears to use more than one core.
My submission script currently looks like this:
#!/bin/bash -l
#PBS -N Test
#PBS -l select=4:ncpus=3:mpiprocs=3:mem=10gb
#PBS -l walltime=2:00:00
cd $PBS_O_WORKDIR
module load intel
module load ansys/21.1
/pkg/suse12/software/ANSYS/21.1/v211/ansys/bin/ansysdis211 -i ds.dat -o solve.out -b -dis -mpi intelmpi -np 10
I've also provided a screenshot of the command-line options within Workbench's settings for Mechanical APDL.
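The script above asks PBS for 12 MPI slots (4 chunks of 3) but passes -np 10 to MAPDL and never tells it which hosts PBS granted, which may be why the run never spreads out. Below is a minimal sketch of driving distributed MAPDL from $PBS_NODEFILE instead; it assumes the standard ansys211 launcher and its documented -machines option are available in this install, so treat it as a starting point rather than a verified recipe.
#!/bin/bash -l
#PBS -N Test
#PBS -l select=4:ncpus=3:mpiprocs=3:mem=10gb
#PBS -l walltime=2:00:00
cd "$PBS_O_WORKDIR"
module load intel
module load ansys/21.1
# Build a host:cores list (nodeA:3:nodeB:3:...) from the nodes PBS actually allocated
MACHINES=$(sort "$PBS_NODEFILE" | uniq -c | awk '{printf "%s%s:%s", sep, $2, $1; sep=":"} END {print ""}')
# Launch distributed MAPDL on exactly that allocation (12 ranks here) instead of a hard-coded -np
/pkg/suse12/software/ANSYS/21.1/v211/ansys/bin/ansys211 -b -dis -mpi intelmpi -machines "$MACHINES" -i ds.dat -o solve.out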
November 23, 2021 at 4:26 pm, mrife (Ansys Employee):
Hi @KartiSinghFreeman, the MAPDL Options panel shown is for the MAPDL Component System in Workbench, not for WB Mechanical. What does the solve process look like for the PBS queue in WB Mechanical?
Mike
November 23, 2021 at 4:30 pm, KartiSinghFreeman (Subscriber):
Thanks, that makes sense now.
I managed to get Workbench working with the script below, but now my issue is that I cannot solve with more than two processors. If I go into the GUI and load each component locally, I can solve with more processors, but when submitting on the HPC it defaults to two. I have no idea how to increase the processor count.
#!/bin/bash -l
#PBS -N Test
#PBS -l select=10:ncpus=1:mpiprocs=1:mem=10gb
#PBS -l walltime=00:30:00
#PBS -p 1023
cd $PBS_O_WORKDIR
module load ansys/20.1
export I_MPI_HYDRA_BOOTSTRAP=ssh; export KMP_AFFINITY=balanced
runwb2 -B -E "Update();Save(Overwrite=True)" -F Test.wbpj
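The batch Update() reuses the core count stored in the Mechanical project's solve process settings rather than anything on the PBS resource line (it shows up later in this thread as RSM_HPC_CORES). A lightly annotated version of the same script is sketched below; the setting name is quoted from memory, so check the Solve Process Settings dialog in your Mechanical version.
#!/bin/bash -l
#PBS -N Test
#PBS -l select=10:ncpus=1:mpiprocs=1:mem=10gb
#PBS -l walltime=00:30:00
#PBS -p 1023
cd "$PBS_O_WORKDIR"
module load ansys/20.1
export I_MPI_HYDRA_BOOTSTRAP=ssh
export KMP_AFFINITY=balanced
# Before saving the project, raise the core count in Mechanical (Solve Process Settings >
# Advanced, "Max number of utilized cores") to match the 10 slots requested above; the
# batch Update() below reuses whatever value was saved with the project.
runwb2 -B -E "Update();Save(Overwrite=True)" -F Test.wbpj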
November 23, 2021 at 5:31 pm, mrife (Ansys Employee):
Is there a cluster admin you can ask? I think the ncpus and mpiprocs being 1 is the answer. Normally we connect to PBS via RSM, and RSM knows how to submit the correct PBS command to launch the job. I think it first asks PBS for the compute node list and the number of CPU cores to use on each, given the total number of cores you requested to solve on. I think these are stored as variables and used in the select/ncpus/mpiprocs line.
Mike
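For illustration only, here is a rough sketch of the mapping Mike describes, with made-up values (8 cores per node) and a hypothetical job.sh; a real RSM submission does this internally, and your cluster's numbers will differ.
#!/bin/bash
# Hypothetical illustration of turning a total core request into a PBS select statement,
# in the spirit of what RSM does when it submits the job.
TOTAL_CORES=12        # cores requested by the solver
CORES_PER_NODE=8      # physical cores per compute node (site-specific)
NODES=$(( (TOTAL_CORES + CORES_PER_NODE - 1) / CORES_PER_NODE ))   # round up to whole nodes
PER_NODE=$(( (TOTAL_CORES + NODES - 1) / NODES ))                  # spread ranks evenly
# Each chunk advertises as many ncpus/mpiprocs as the MPI ranks it will host
qsub -l select=${NODES}:ncpus=${PER_NODE}:mpiprocs=${PER_NODE} job.sh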
November 24, 2021 at 3:49 am, KartiSinghFreeman (Subscriber):
It's definitely something related to the cluster, as these are the two errors I get before it defaults to 2 cores.
--------------------- Error 1
#!/bin/sh
echo job started on $(hostname)
# check shared cluster directory
ClusterSharedDirectory="/home/Test/_ProjectScratch/Scr7200/"
[ -d "$ClusterSharedDirectory" ] || { echo "Shared cluster directory does not exist on execution host, make sure it is mounted, shared out and can be accessed from all nodes."; echo 1008 > "/home/Test/_ProjectScratch/Scr7200/exitcode_6aa0b86a-5272-463b-8291-ce057e10320e.rsmout"; exit 1008; }
# check AWP_ROOT
echo AWP_ROOT201=$AWP_ROOT201
[ -z "$AWP_ROOT201" ] && echo "AWP_ROOT201 is not set on execution host" && echo 1000 > "/home/Test/_ProjectScratch/Scr7200/exitcode_6aa0b86a-5272-463b-8291-ce057e10320e.rsmout" && exit 1000
[ -d "$AWP_ROOT201" ] || { echo "AWP_ROOT201 directory does not exist on execution host"; echo 1009 > "/home/Test/_ProjectScratch/Scr7200/exitcode_6aa0b86a-5272-463b-8291-ce057e10320e.rsmout"; exit 1009; }
# check command can be found
command="$AWP_ROOT201/commonfiles/CPython/3_7/linx64/Release/python/runpython"
[ -f "$command" ] || { echo "$command not found"; echo 1007 > "/home/Test/_ProjectScratch/Scr7200/exitcode_6aa0b86a-5272-463b-8291-ce057e10320e.rsmout"; exit 1007; }
# running the cluster command
echo command: "$AWP_ROOT201/commonfiles/CPython/3_7/linx64/Release/python/runpython" -B -E "$AWP_ROOT201/RSM/Config/scripts/ClusterJobs.py" "/home/Test/_ProjectScratch/Scr7200/control_6aa0b86a-5272-463b-8291-ce057e10320e.rsm"
"$AWP_ROOT201/commonfiles/CPython/3_7/linx64/Release/python/runpython" -B -E "$AWP_ROOT201/RSM/Config/scripts/ClusterJobs.py" "/home/Test/_ProjectScratch/Scr7200/control_6aa0b86a-5272-463b-8291-ce057e10320e.rsm"
--------------------- Error 2
ARC
cl4n007
$AWP_ROOT201/SEC/SolverExecutionController/runsec.sh
null
done
NOSCRATCH
NOUNC
/home/Test/_ProjectScratch/Scr7200/
true
SSH
NOLIVELOGFILE
stdout_6aa0b86a-5272-463b-8291-ce057e10320e.live
stderr_6aa0b86a-5272-463b-8291-ce057e10320e.live
*.dat
file*.*
*.mac
thermal.build
commands.xml
SecInput.txt
done
file.abt
sec.interrupt
done
*.xml
*.NR*
*.swf
CAERepOutput.xml
Load_*.inp
Mode_mapping_*.txt
NotSupportedElems.dat
ObjectiveHistory.out
PostImage*.png
cyclic_map.json
exit.topo
file*.dsub
file*.ldhi
file*.mntr
file*.png
file*.r0*
file*.r1*
file*.r2*
file*.r3*
file*.r4*
file*.r5*
file*.r6*
file*.r7*
file*.r8*
file*.r9*
file*.rd*
file*.rst
file.BCS
file.DSP
file.PCS
file.ce
file.cm
file.cnd
file.cnm
file.err
file.gst
file.json
file.nd*
file.nlh
file.nr*
file.rdb
file.rfl
file.spm
file0.BCS
file0.PCS
file0.ce
file0.cnd
file0.err
file0.gst
file0.nd*
file0.nlh
file0.nr*
frequencies_*.out
input.x17
intermediate*.topo
morphed*.stl
post.out
record.txt
solve*.out
topo.err
topo.out
vars.topo
SecDebugLog.txt
secStart.log
sec.validation.executed
sec.envvarvalidation.executed
sec.failure
*.xml
*.NR*
*.swf
CAERepOutput.xml
Load_*.inp
Mode_mapping_*.txt
NotSupportedElems.dat
ObjectiveHistory.out
PostImage*.png
cyclic_map.json
exit.topo
file*.dsub
file*.ldhi
file*.mntr
file*.png
file*.r0*
file*.r1*
file*.r2*
file*.r3*
file*.r4*
file*.r5*
file*.r6*
file*.r7*
file*.r8*
file*.r9*
file*.rd*
file*.rst
file.BCS
file.DSP
file.PCS
file.ce
file.cm
file.cnd
file.cnm
file.err
file.gst
file.json
file.nd*
file.nlh
file.nr*
file.rdb
file.rfl
file.spm
file0.BCS
file0.PCS
file0.ce
file0.cnd
file0.err
file0.gst
file0.nd*
file0.nlh
file0.nr*
frequencies_*.out
input.x17
intermediate*.topo
morphed*.stl
post.out
record.txt
solve*.out
topo.err
topo.out
vars.topo
SecDebugLog.txt
sec.solverexitcode
secStart.log
sec.failure
sec.envvarvalidation.executed
done
stdout_6aa0b86a-5272-463b-8291-ce057e10320e.rsmout
stderr_6aa0b86a-5272-463b-8291-ce057e10320e.rsmout
control_6aa0b86a-5272-463b-8291-ce057e10320e.rsm
hosts.dat
exitcode_6aa0b86a-5272-463b-8291-ce057e10320e.rsmout
exitcodeCommands_6aa0b86a-5272-463b-8291-ce057e10320e.rsmout
stdout_6aa0b86a-5272-463b-8291-ce057e10320e.live
stderr_6aa0b86a-5272-463b-8291-ce057e10320e.live
ClusterJobCustomization.xml
ClusterJobs.py
clusterjob_6aa0b86a-5272-463b-8291-ce057e10320e.sh
clusterjob_6aa0b86a-5272-463b-8291-ce057e10320e.bat
inquire.request
inquire.confirm
request.upload.rsm
request.download.rsm
wait.download.rsm
scratch.job.rsm
volatile.job.rsm
restart.xml
cancel_6aa0b86a-5272-463b-8291-ce057e10320e.rsmout
liveLogLastPositions_6aa0b86a-5272-463b-8291-ce057e10320e.rsm
stdout_6aa0b86a-5272-463b-8291-ce057e10320e_kill.rsmout
stderr_6aa0b86a-5272-463b-8291-ce057e10320e_kill.rsmout
sec.interrupt
stdout_6aa0b86a-5272-463b-8291-ce057e10320e_*.rsmout
stderr_6aa0b86a-5272-463b-8291-ce057e10320e_*.rsmout
stdout_task_*.live
stderr_task_*.live
control_task_*.rsm
stdout_task_*.rsmout
stderr_task_*.rsmout
exitcode_task_*.rsmout
exitcodeCommands_task_*.rsmout
file.abt
done
RSM_IRON_PYTHON_HOME
/pkg/suse12/software/ANSYS/20.1/v201/aisol/../commonfiles/IronPython
RSM_TASK_WORKING_DIRECTORY
/home/Test/_ProjectScratch/Scr7200
RSM_USE_SSH_LINUX
True
RSM_QUEUE_NAME
local
RSM_CONFIGUREDQUEUE_NAME
Local
RSM_COMPUTE_SERVER_MACHINE_NAME
cl4n007
RSM_HPC_JOBNAME
Mechanical
RSM_HPC_DISPLAYNAME
Wishbone_Test-DP0-Model (C2)-Static Structural (C3)-Solution (C4)
RSM_HPC_CORES
2
RSM_HPC_DISTRIBUTED
TRUE
RSM_HPC_NODE_EXCLUSIVE
FALSE
RSM_HPC_QUEUE
local
RSM_HPC_USER
nxxxxxxxx
RSM_HPC_WORKDIR
/home/Test/_ProjectScratch/Scr7200
RSM_HPC_JOBTYPE
Mechanical_ANSYS
RSM_HPC_ANSYS_LOCAL_INSTALL_DIRECTORY
/pkg/suse12/software/ANSYS/20.1/v201/aisol/..
RSM_HPC_VERSION
201
RSM_HPC_STAGING
/home/Test/_ProjectScratch/Scr7200/
RSM_HPC_LOCAL_PLATFORM
Linux
RSM_HPC_CLUSTER_TARGET_PLATFORM
Linux
RSM_HPC_STDOUTFILE
stdout_6aa0b86a-5272-463b-8291-ce057e10320e.rsmout
RSM_HPC_STDERRFILE
stderr_6aa0b86a-5272-463b-8291-ce057e10320e.rsmout
RSM_HPC_STDOUTLIVE
stdout_6aa0b86a-5272-463b-8291-ce057e10320e.live
RSM_HPC_STDERRLIVE
stderr_6aa0b86a-5272-463b-8291-ce057e10320e.live
RSM_HPC_SCRIPTS_DIRECTORY_LOCAL
/pkg/suse12/software/ANSYS/20.1/v201/aisol/../RSM/Config/scripts
RSM_HPC_SCRIPTS_DIRECTORY
$AWP_ROOT201/RSM/Config/scripts
RSM_HPC_SUBMITHOST
localhost
RSM_HPC_STORAGEID
e0ab39a5-9380-430c-9b38-4303e97fa96d Mechanical=LocalNoCopy#localhost$/home/Test/_ProjectScratch/Scr7200/ Tuesday, November 23, 2021 02:01:40.634 PM True
RSM_HPC_PLATFORMSTORAGEID
/home/Test/_ProjectScratch/Scr7200/
RSM_HPC_NATIVEOPTIONS
ARC_ROOT
/pkg/suse12/software/ANSYS/20.1/v201/aisol/../RSM/Config/scripts/../../ARC
RSM_HPC_KEYWORD
ARC
RSM_PYTHON_LOCALE
en-us
done
AWP_ROOT201
2.0
6aa0b86a-5272-463b-8291-ce057e10320e
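The dump above already shows the effective request: RSM_HPC_CORES is 2, i.e. RSM only asked the scheduler for two cores. Assuming that dump is the control_*.rsm file written to the scratch directory shown, a quick way to confirm the value on future runs is:
# Print the core count Mechanical/RSM actually passed down to the scheduler
grep -A1 RSM_HPC_CORES /home/Test/_ProjectScratch/Scr7200/control_*.rsm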
November 24, 2021 at 8:36 am, KartiSinghFreeman (Subscriber):
I've spent hours on this and I cannot get it to run on more than two distributed cores on the HPC. On my local computer I can run more cores using the university licenses, so this is extremely frustrating.
I don't seem to have this issue with Fluent, but with Mechanical it will not work.
The topic 'ANSYS Workbench HPC MPI Command Line Settings' is closed to new replies.