-
August 7, 2023 at 4:17 pm
jsarlo
Subscriber
I am working with AnsysEM 2022 R2. I am a sysadmin working with a researcher. We are trying to run a batch job with Slurm and want to use multiple compute nodes in the cluster. We have the .aedt input file and are using the following as the execution line of the job script.
ansysedt -ng -batchsolve -dis -mpi -machinelist list=$hl num=$SLURM_NTASKS ${InputFile}
When I watch the compute nodes that get assigned, I only see the first one being used. Nothing ever starts on the second compute node. The $hl list gets built to something like list=compute-4-53-ib0:48:48:98%,compute-7-19-ib0:48:48:98%. I have also tried building the list as individual 1:1 entries, repeated 48 times for each compute node (compute-4-53-ib0:1:1:98%,compute-4-53-ib0:1:1:98%...compute-7-19-ib0:1:1:98%, ...).
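[Editor's note: for readers wondering how a list like the one above can be assembled inside the job script, here is a minimal sketch. The helper name build_machinelist is hypothetical, and the 48-core / 98% values simply mirror the post; in a real job the host names would come from something like `scontrol show hostnames "$SLURM_JOB_NODELIST"`.]

```shell
#!/bin/bash
# Sketch: build an Ansys-style "-machinelist list=..." string from a set of
# host names. Each entry is host:tasks:cores:ram%, matching the format used
# in the post above. build_machinelist is an illustrative helper, not an
# Ansys-provided command.
build_machinelist() {
    local cores=$1 ram=$2 hl="" host
    shift 2
    for host in "$@"; do
        # append "host:cores:cores:ram%", comma-separated
        hl="${hl:+${hl},}${host}:${cores}:${cores}:${ram}%"
    done
    printf '%s\n' "list=${hl}"
}

# With the two hosts from the post:
build_machinelist 48 98 compute-4-53-ib0 compute-7-19-ib0
# prints: list=compute-4-53-ib0:48:48:98%,compute-7-19-ib0:48:48:98%
```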
Is there something else that needs to be on the command line to use both compute nodes or is there something else that needs to be done?
Jeff
-
August 8, 2023 at 3:27 pm
randyk
Forum Moderator
Hi Jeff,
Please consider creating the following script (job.sh in this example), making it Unix-formatted and executable, then submitting it:
dos2unix ./job.sh
chmod +x ./job.sh
sbatch ./job.sh
Modify lines 2-3, 12-13, and 39 as needed.
Note 1: the value "numcores=xx" on line 39 must match the allocated resource core count.

job.sh:
#!/bin/bash
#SBATCH -N 2              # allocate 2 nodes
#SBATCH -n 32             # 32 tasks total
#SBATCH -J AnsysEMTest    # sensible name for the job
#SBATCH -p default        # partition name
##SBATCH --mem 0          # allocates all the memory on the node to the job
##SBATCH --time 0
##SBATCH --mail-user="user@company.com"
##SBATCH --mail-type=ALL

# Project Name and setup
JobName=OptimTee.aedt
AnalysisSetup=""

# Project location
JobFolder=$(pwd)

#### Do not modify any items below this line unless requested ####
InstFolder=/opt/AnsysEM/v222/Linux64

# SLURM
export ANSYSEM_GENERIC_MPI_WRAPPER=${InstFolder}/schedulers/scripts/utils/slurm_srun_wrapper.sh
export ANSYSEM_COMMON_PREFIX=${InstFolder}/common
srun_cmd="srun --overcommit --export=ALL -n 1 -N 1 --cpu-bind=none --mem-per-cpu=0 --overlap "
# note: the srun '--overlap' option was introduced in Slurm 20.11. If running an older Slurm version, remove the '--overlap' argument.
export ANSYSEM_TASKS_PER_NODE="${SLURM_TASKS_PER_NODE}"

# Setup Batchoptions
echo "\$begin 'Config'" > ${JobFolder}/${JobName}.options
echo "'Desktop/Settings/ProjectOptions/HPCLicenseType'='Pack'" >> ${JobFolder}/${JobName}.options
echo "'HFSS/RAMLimitPercent'=90" >> ${JobFolder}/${JobName}.options
echo "'HFSS 3D Layout Design/RAMLimitPercent'=90" >> ${JobFolder}/${JobName}.options
echo "'HFSS/RemoteSpawnCommand'='scheduler'" >> ${JobFolder}/${JobName}.options
echo "'HFSS 3D Layout Design/RemoteSpawnCommand'='scheduler'" >> ${JobFolder}/${JobName}.options
# If multiple networks on execution host, specify network CIDR
# echo "'Desktop/Settings/ProjectOptions/AnsysEMPreferredSubnetAddress'='192.168.1.0/24'" >> ${JobFolder}/${JobName}.options
echo "\$end 'Config'" >> ${JobFolder}/${JobName}.options
# Submit AEDT Job (SLURM requires 'srun' and the tight-integration change to slurm_srun_wrapper.sh)
${srun_cmd} ${InstFolder}/ansysedt -ng -monitor -waitforlicense -useelectronicsppe=1 -distributed -machinelist numcores=32 -auto -batchoptions ${JobFolder}/${JobName}.options -batchsolve ${AnalysisSetup} ${JobFolder}/${JobName} > ${JobFolder}/${JobName}.progress
-
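[Editor's note: assuming the script runs as written, the echo lines above should leave a batchoptions file that looks roughly like this:]

```
$begin 'Config'
'Desktop/Settings/ProjectOptions/HPCLicenseType'='Pack'
'HFSS/RAMLimitPercent'=90
'HFSS 3D Layout Design/RAMLimitPercent'=90
'HFSS/RemoteSpawnCommand'='scheduler'
'HFSS 3D Layout Design/RemoteSpawnCommand'='scheduler'
$end 'Config'
```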
October 5, 2023 at 6:59 pm
jsarlo
Subscriber
Thanks for the suggestion. I did try this with a change to the licensing, and I also set
#SBATCH -N 2
#SBATCH --ntasks-per-node 4
just to make sure that it should try to use the 2nd compute node. I still got the same results, with no processes showing up on the 2nd node. I have tried this with both 2022 R2 and 2023 R2. I am not sure if there is something missing in the input file or if something else in the install is not configured properly. We are now getting to the point that the users have larger jobs that cannot fit on just one node and really need to be able to use multiple nodes.
Is there something else I need to be checking?
Jeff
-
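[Editor's note: one Slurm-side sanity check, independent of AnsysEM, is to confirm what the allocation actually exported. With `#SBATCH -N 2 --ntasks-per-node 4`, Slurm sets SLURM_TASKS_PER_NODE to the compact form "4(x2)". The helper below, a pure-bash sketch with an illustrative name, expands that notation so you can verify two nodes really received tasks.]

```shell
#!/bin/bash
# Sketch: expand Slurm's compact SLURM_TASKS_PER_NODE notation, e.g.
# "4(x2)" -> "4 4" or "2(x3),1" -> "2 2 2 1". Pure bash; no Slurm
# commands needed, so it can be tested outside a job.
expand_tasks_per_node() {
    local parts item count reps out=()
    IFS=',' read -ra parts <<< "$1"      # split on commas
    for item in "${parts[@]}"; do
        if [[ $item =~ ^([0-9]+)\(x([0-9]+)\)$ ]]; then
            count=${BASH_REMATCH[1]}     # tasks per node
            reps=${BASH_REMATCH[2]}      # number of nodes with that count
            for ((i = 0; i < reps; i++)); do out+=("$count"); done
        else
            out+=("$item")               # plain count, one node
        fi
    done
    echo "${out[*]}"
}

# Inside a job you would call: expand_tasks_per_node "$SLURM_TASKS_PER_NODE"
expand_tasks_per_node "4(x2)"   # prints: 4 4  (two nodes, four tasks each)
```

If this prints counts for only one node, the problem is in the Slurm request rather than in the AnsysEM command line.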
October 5, 2023 at 7:09 pm
randyk
Forum ModeratorHi Jeff,
Can you please paste the contents of OptimTee.aedt.batchinfo/OptimTee-xxx.log?
-- if you need to, do a replace-all on each of the hostnames and the project path listed in that file.
Thanks,
Randy -
October 10, 2023 at 12:51 pm
jsarlo
SubscriberThis is what that file has
Ansys Electronics Desktop Version 2023.2.0, Build: 2023-05-16 22:11:08
Location: /share/apps/AnsysEM-2023r2/v232/Linux64/ansysedt.exe
Batch Solve/Save: /project/hpcc/jeff/ansys/maxwell/2node_pipe_2023/Manual_half_pipe.aedt
Starting Batch Run: 1:34:05 PM Oct 05, 2023
Temp directory: /tmp
Project directory: /home/jsarlo/Ansoft
[info] Running SLURM job with ID 1878304. Command line: "/share/apps/AnsysEM-2023r2/v232/Linux64/ansysedt.exe -ng -monitor -waitforlicense -useelectronicsppe=1 -distributed -machinelist numcores=8 -auto -batchoptions /project/hpcc/jeff/ansys/maxwell/2node_pipe_2023/Manual_half_pipe.aedt.options -batchsolve /project/hpcc/jeff/ansys/maxwell/2node_pipe_2023/Manual_half_pipe.aedt".
Simulation settings:
[info] Simulation settings:
Design type: Maxwell 3D
[info] Design type: Maxwell 3D
Allow off core: False
[info] Allow off core: False
Using automatic settings
[info] Using automatic settings
Optimetrics variations will be solved sequentially.
[info] Optimetrics variations will be solved sequentially.
Machines:
[info] Machines:
compute-2-20 [773417 MB]: RAM: 90%, 4 cores, 0 GPUs
compute-2-24 [773417 MB]: RAM: 90%, 4 cores, 0 GPUs
[info] compute-2-20 [773417 MB]: RAM: 90%, 4 cores, 0 GPUs
[info] compute-2-24 [773417 MB]: RAM: 90%, 4 cores, 0 GPUs
[info] Project:Manual_half_pipe, Setup1 : [PROFILE] Solution Process : Start Time: 10/05/2023 13:34:19, Host: compute-2-20.local, Processor: 48, OS: Linux 4.18.0-477.15.1.el8_8.x86_64, Product: Maxwell 3D 2023.2.0 (01:34:19 PM Oct 05, 2023)
[info] Project:Manual_half_pipe, Setup1 : [PROFILE]     Executing From: /share/apps/AnsysEM-2023r2/v232/Linux64/MAXWELLCOMENGINE.exe (01:34:19 PM Oct 05, 2023)
[info] Project:Manual_half_pipe, Setup1 : [PROFILE] HPC : Type: Auto, MPI Vendor: Intel, MPI Version: 2018 (01:34:19 PM Oct 05, 2023)
[info] Project:Manual_half_pipe, Setup1 : [PROFILE] Machine 1 : Name: compute-2-20.local, RAM Limit: 90.000000%, Cores: 4 (01:34:19 PM Oct 05, 2023)
[info] Project:Manual_half_pipe, Setup1 : [PROFILE] Machine 2 : Name: compute-2-24.local, RAM Limit: 90.000000%, Cores: 4 (01:34:19 PM Oct 05, 2023)
[info] Project:Manual_half_pipe, Setup1 : [PROFILE] Stop : (01:34:19 PM Oct 05, 2023)
[info] Project:Manual_half_pipe, Setup1 : [PROFILE] Design Validation : Level: Perform full validations, Elapsed Time: 00:00:00, Memory: 75.6 M (01:34:19 PM Oct 05, 2023)
[info] Project:Manual_half_pipe, Setup1 : [PROFILE] Adaptive Meshing : Time: 10/05/2023 13:34:20 (01:34:20 PM Oct 05, 2023)
[info] Project:Manual_half_pipe, Setup1 : [PROFILE] Pass 2 (01:34:20 PM Oct 05, 2023)
[info] Project:Manual_half_pipe, Setup1 : [PROFILE] Adaptive Refine : Real Time 00:08:52 : CPU Time 00:09:01 : Memory 3.39 G : Tetrahedra: 4213922, Cores: 1 (01:43:12 PM Oct 05, 2023)
[error] Project:Manual_half_pipe, Design:Maxwell3DDesign1 (EddyCurrent), Unable to create child process: 3dedy. Please contact Ansys technical support. -- Simulating on machine: compute-2-20 (01:45:15 PM Oct 05, 2023)
[info] Project:Manual_half_pipe, Setup1 : [PROFILE] Stop : (01:45:15 PM Oct 05, 2023)
[info] Project:Manual_half_pipe, Setup1 : [PROFILE] Stop : Elapsed Time: 00:10:55 (01:45:15 PM Oct 05, 2023)
[info] Project:Manual_half_pipe, Setup1 : [PROFILE]     Unable to create child process: 3dedy. Please contact Ansys technical support. (01:45:15 PM Oct 05, 2023)
[info] Project:Manual_half_pipe, Setup1 : [PROFILE]     Stop Time: 10/05/2023 13:45:15, Status: Engine Detected Error (01:45:15 PM Oct 05, 2023)
[info] Project:Manual_half_pipe, Setup1 : [PROFILE] Stop : Elapsed Time: 00:10:56, ComEngine Memory: 90.6 M (01:45:15 PM Oct 05, 2023)
[error] Project:Manual_half_pipe, Design:Maxwell3DDesign1 (EddyCurrent), Simulation completed with execution error on server: compute-2-20. (01:45:16 PM Oct 05, 2023)
Stopping Batch Run: 1:45:23 PM Oct 05, 2023
-
October 10, 2023 at 12:52 pm
jsarlo
SubscriberThis was from the 2023 version test.
-
The topic ‘ansysedt in batch using MPI’ is closed to new replies.
© 2025 Copyright ANSYS, Inc. All rights reserved.