TAGGED: 2-way-fsi, cluster, hpc, linux, system-coupling
December 6, 2021 at 8:28 am
jfonken (Subscriber):
Hi all,
I'm running into trouble with my FSI simulations using Ansys System Coupling on a Linux CentOS 8 HPC cluster with Open MPI. SLURM is used to schedule the jobs, and Ansys 2021 R2 is used. My simulation runs fine on my own laptop, but gives an error message when running on the cluster. I will attach the error message in a comment, since it's too long to fit in the question section. Neither Fluent nor Mechanical gives an error message before the System Coupling process is interrupted.
The debug information from System Coupling shows that it is able to get the Fluent mesh, as well as the nodes and elements of the Mechanical mesh, but the trace stops after obtaining the elements (see the second comment).
December 6, 2021 at 8:28 am
jfonken (Subscriber):
Error message from System Coupling:
|Build Information|
+-----------------------------------------------------------------------------+
| System Coupling|
|2021 R2: Build ID: 5b5c87f Build Date: 21 May 2021 08:38:06|
| Fluid Flow (Fluent)|
|ANSYS Fluent 21.2.0, Build Time:May 28 2021 13:48:13 EDT, Build Id:10201, |
|OS Version:lnamd64|
| MAPDL Transient|
|Mechanical APDL Release 2021 R2Build 21.2UP20210601|
|DISTRIBUTED LINUX x64Version|
+=============================================================================+
===============================================================================
+=============================================================================+
||
|Analysis Initialization|
||
+=============================================================================+
===============================================================================
sched_setaffinity() call failed: Invalid argument
sched_setaffinity() call failed: Invalid argument
make: *** No rule to make target 'clean'.Stop.
[tcn362:762304] *** An error occurred in MPI_Gatherv
[tcn362:762304] *** reported by process [2322595841,1]
[tcn362:762304] *** on communicator MPI_COMM_WORLD
[tcn362:762304] *** MPI_ERR_ARG: invalid argument of some other kind
[tcn362:762304] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort [tcn362:762304] ***and potentially your MPI job)
Error in TcpCommunicatingSocket::recv End of file
Error in TcpCommunicatingSocket::send Broken pipe
==============================================================================
Stack backtrace generated for process id 762959 on signal 11 :
Error in TcpCommunicatingSocket::send Broken pipe
1000000: fluent() [0x785ff9]
1000000: /lib64/libc.so.6(+0x37400) [0x14a630403400]
1000000: fluent(RpcNextArgAsInt32+0) [0xb274d0]
1000000: fluent(Get_Restart_Initial_Step_Index+0x66) [0x872ee6]
1000000: fluent() [0x8245a6]
1000000: fluent(eval+0x4b5) [0x8c1955]
1000000: fluent(eval+0x6cd) [0x8c1b6d]
1000000: fluent(eval+0xd21) [0x8c21c1]
1000000: fluent(eval+0xd21) [0x8c21c1]
1000000: fluent(eval+0xd21) [0x8c21c1]
1000000: fluent(eval+0xd21) [0x8c21c1]
1000000: fluent() [0x8c29b6]
1000000: fluent(eval_errprotect+0x4e) [0x8c308e]
1000000: fluent(eval+0x21f) [0x8c16bf]
1000000: fluent(eval+0xd21) [0x8c21c1]
Please include this information with any bug report you file on this issue!
==============================================================================
Error in TcpCommunicatingSocket::send Broken pipe
Error in TcpCommunicatingSocket::send Broken pipe
Error in TcpCommunicatingSocket::send Broken pipe
Error in TcpCommunicatingSocket::send Broken pipe
Error in TcpCommunicatingSocket::send Broken pipe
+-----------------------------------------------------------------------------+
| Failed to retrieve mesh(es).|
+-----------------------------------------------------------------------------+
Error in TcpCommunicatingSocket::send Broken pipe
Error in TcpCommunicatingSocket::send Broken pipe
Error in TcpCommunicatingSocket::send Broken pipe
Error in TcpCommunicatingSocket::send Broken pipe
Error in TcpCommunicatingSocket::send Broken pipe
Error: Cortex received a fatal signal (SEGMENTATION VIOLATION).
Error Object: Error in TcpCommunicatingSocket::send Broken pipe
Error in TcpCommunicatingSocket::send Broken pipe
Error in TcpCommunicatingSocket::send Broken pipe
Error in TcpCommunicatingSocket::send Broken pipe
Error in TcpCommunicatingSocket::send Broken pipe
Error in TcpCommunicatingSocket::send Broken pipe
Error in TcpCommunicatingSocket::send Broken pipe
Error in TcpCommunicatingSocket::send Broken pipe
Traceback (most recent call last):
File "PyLib/physicscoupling/importer/__init__.py", line 40, in importMesh
File "PyLib/kernel/util/Memory.py", line 177, in wrapper
File "PyLib/physicscoupling/importer/__init__.py", line 121, in buildMesh
File "PyLib/physicscoupling/importer/__init__.py", line 158, in _getFaceAndCellZoneIds
AttributeError: 'tuple' object has no attribute 'items'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "PyLib/main/Controller.py", line 147, in
File "PyLib/main/Controller.py", line 143, in _run
File "PyLib/main/Controller.py", line 92, in _executeScript
File "PyLib/kernel/commands/__init__.py", line 31, in readScriptFile
File "PyLib/kernel/commands/CommandManager.py", line 169, in readScriptFile
File "runFSI.txt", line 49, in
Solve
File "PyLib/kernel/commands/CommandDefinition.py", line 74, in func
File "PyLib/kernel/commands/__init__.py", line 28, in executeCommand
File "PyLib/kernel/commands/CommandManager.py", line 122, in executeCommand
File "PyLib/cosimulation/externalinterface/core/solver.py", line 125, in execute
File "PyLib/cosimulation/solver/__init__.py", line 123, in solve
File "PyLib/kernel/util/Memory.py", line 177, in wrapper
File "PyLib/cosimulation/solver/__init__.py", line 526, in __initializeControlled
File "PyLib/cosimulation/solver/__init__.py", line 796, in __importMesh
File "PyLib/kernel/util/Memory.py", line 177, in wrapper
File "PyLib/cosimulation/solver/__init__.py", line 807, in __importMeshAndCreateZonesForFmu
File "PyLib/physicscoupling/importer/__init__.py", line 43, in importMesh
RuntimeError: Failed to retrieve mesh(es).
Shutting down System Coupling compute node processes.
Error in TcpCommunicatingSocket::send Broken pipe
Some compute-node processes or machines have crashed.
Host process lost connection while reading. Fatal error!
999999 (../../src/mpsystem.c@1221): mpt_read: failed: errno = 0
999999: mpt_read: error: read failed trying to read 4 bytes: Success
December 7, 2021 at 11:25 am
jfonken (Subscriber):
Debug log (last part):
SndReq (FLUENT-1) MeshData::GetNodes
Req: 5356, 0, 110007
CTrace (3): Leaving makeRemoteCall
Rsp: [3153, 3156, 3158, 3160, 3163, ... (n=110007; min=3153; max=114098) ], [-0.00616212, 0.00786893, -8.32667e-17,
-0.0029959, 0.00953511, ... (n=330021; min=-0.0253699; max=0.168801; mean=0.0327284) ]
CTrace (3): Entering makeRemoteCall
SndReq (FLUENT-1) FaceMeshData::GetElementCount
Req: 5356, -1
CTrace (3): Leaving makeRemoteCall
Rsp: 219697
CTrace (3): Entering makeRemoteCall
SndReq (FLUENT-1) FaceMeshData::GetElements
Req: 5356, -1, 0, 219697
CTrace (3): Leaving makeRemoteCall
Rsp: [5767673, 5767674, 5767675, 5767676, 5767677, ... (n=219697; min=5767673; max=5987369) ], [1825839, 1769492,
1826561, 1828079, 1821171, ... (n=219697; min=48; max=2061779) ], [0, 0, 0, 0, 0, ... (n=219697; min=0; max=0) ],
[4411, 4410, 4409, 4414, 4413, ... (n=659091; min=3153; max=114098) ], [3, 3, 3, 3, 3, ... (n=219697; min=3; max=3) ]
CTrace (3): Entering makeRemoteCall
SndReq (FLUENT-1) RegionFilter::DeleteFilter
Req: 1
CTrace (3): Leaving makeRemoteCall
CTrace (2): Leaving fillRegionData
CTrace (1): Leaving loadRegions
CTrace (1): Entering loadRegions
CTrace (2): Entering makeRemoteCall
SndReq (MAPDL-2) RegionFilter::NewFilter
CTrace (2): Leaving makeRemoteCall
Rsp: 1
CTrace (2): Entering makeRemoteCall
SndReq (MAPDL-2) RegionFilter::SetRegionName
Req: 1, FSIN_1
CTrace (2): Leaving makeRemoteCall
CTrace (2): Entering makeRemoteCall
SndReq (MAPDL-2) RegionInfo::ApplyFilter
Req: 1
CTrace (2): Leaving makeRemoteCall
CTrace (2): Entering makeRemoteCall
SndReq (MAPDL-2) RegionInfo::GetIds
CTrace (2): Leaving makeRemoteCall
Rsp: [1]
CTrace (2): Entering makeRemoteCall
SndReq (MAPDL-2) RegionInfo::GetTopolDimension
CTrace (2): Leaving makeRemoteCall
Rsp: [2]
CTrace (2): Entering makeRemoteCall
SndReq (MAPDL-2) MeshInfo::GetUnits
Req: 1
CTrace (2): Leaving makeRemoteCall
Rsp: [0]
CTrace (2): Entering makeRemoteCall
SndReq (MAPDL-2) MeshData::GetNodeCount
Req: 1
CTrace (2): Leaving makeRemoteCall
Rsp: 68580
CTrace (2): Entering makeRemoteCall
SndReq (MAPDL-2) MeshData::GetNodes
Req: 1, 0, 68580
CTrace (2): Leaving makeRemoteCall
CTrace (2): Entering convertUnits
CTrace (2): Leaving convertUnits
Rsp: [62366, 62368, 62681, 62679, 62367, ... (n=68580; min=1892; max=250040) ], [-0.01, -2.6159e-18, -8.32667e-17,
-0.00999893, 3.49372e-05, ... (n=205740; min=-0.025373; max=0.168801; mean=0.0308368) ]
CTrace (2): Entering makeRemoteCall
SndReq (MAPDL-2) MeshData::GetElementCount
Req: 1, -1
CTrace (2): Leaving makeRemoteCall
Rsp: 22788
CTrace (2): Entering makeRemoteCall
SndReq (MAPDL-2) MeshData::GetElements
Req: 1, -1, 0, 22788
CTrace (2): Leaving makeRemoteCall
Rsp: [8, 8, 8, 8, 8, ... (n=22788; min=8; max=8) ], [8, 8, 8, 8, 8, ... (n=22788; min=8; max=8) ], [62366,
62368, 62681, 62679, 62367, ... (n=182304; min=1892; max=250040) ]
CTrace (2): Entering makeRemoteCall
SndReq (MAPDL-2) RegionFilter::DeleteFilter
Req: 1
CTrace (2): Leaving makeRemoteCall
CTrace (1): Leaving loadRegions
CTrace (1): Entering getAndDumpNodes
CTrace (2): Entering receiveRegionNodes
CTrace (2): Leaving receiveRegionNodes
CTrace (2): Entering receiveRegionNodes
CTrace (3): Entering getNodes
CTrace (3): Leaving getNodes
CTrace (3): Entering getNodes
CTrace (3): Leaving getNodes
CTrace (2): Leaving receiveRegionNodes
CTrace (2): Entering setup
CTrace (2): Leaving setup
CTrace (2): Entering collect
CTrace (2): Leaving collect
CTrace (2): Entering fillNodes
CTrace (2): Leaving fillNodes
CTrace (1): Leaving getAndDumpNodes
CTrace (1): Entering getAndDumpCells
December 16, 2021 at 6:17 am
Ulrich (Ansys Employee):
Hi Judith, let's try to narrow down the problem.
Can you run Fluent alone on the Linux cluster, standalone and/or via Workbench?
And the same question for Mechanical: can you run Mechanical alone on the Linux cluster, standalone and/or via Workbench?
Regards
Ulrich S.
December 16, 2021 at 6:54 am
jfonken (Subscriber):
Hi Ulrich, thanks for your reply, and sorry that I didn't include this information. Both Ansys Fluent and Ansys Mechanical (both 2021 R2 versions) run fine on the cluster, using the openmpi option.
Best, Judith
December 17, 2021 at 12:50 pm
Ulrich (Ansys Employee):
Hi Judith, thanks for confirming that Fluent and Mechanical work standalone.
Hence, you seem to face a specific problem with System Coupling on a Linux cluster with SLURM.
How are you running System Coupling, via Workbench or standalone (e.g., via the CLI)?
In this context, I have so far only found the chapter "Using Parallel Processing Capabilities" in the "System Coupling User's Guide" (https://ansyshelp.ansys.com/account/secured?returnurl=/Views/Secured/corp/v212/en/sysc_ug/sysc_userinterfaces_advtasks_parallel.html?q=linux).
I will try to find more.
Best Regards, Ulrich S.
December 17, 2021 at 1:31 pm
jfonken (Subscriber):
Hi Ulrich,
I run System Coupling as a standalone program. I call it using:
"/sw/arch/Centos8/EB_production/2021/software/ANSYS/2021R2/v212/SystemCoupling/bin/systemcoupling" --mpi openmpi --cnf=${NODEFILE} -R runFSI.txt -l3
The contents of the runFSI.txt file are:
# Load participants
AddParticipant(InputFile = 'fluent.scp')
AddParticipant(InputFile = 'structural.scp')
# Create coupling interface
AddInterface(SideOneParticipant = 'FLUENT-1',SideOneRegions = ['wall_lumen'],SideTwoParticipant = 'MAPDL-2',SideTwoRegions = ['FSIN_1'])
# Add data transfers
# Data transfer 1
AddDataTransfer(Interface = 'Interface-1',TargetSide = 'Two',SideOneVariable = 'force', SideTwoVariable = 'FORC')
# Data transfer 2
AddDataTransfer(Interface = 'Interface-1',TargetSide = 'One',SideOneVariable = 'displacement',SideTwoVariable = 'INCD')
# Set participant execution controls
execCon = DatamodelRoot().CouplingParticipant
execCon['FLUENT-1'].ExecutionControl.ParallelFraction=1.0/6.0
execCon['FLUENT-1'].ExecutionControl.AdditionalArguments = '-mpi=openmpi -meshing -gu'
execCon['MAPDL-2'].ExecutionControl.ParallelFraction=5.0/6.0
execCon['MAPDL-2'].ExecutionControl.AdditionalArguments = '-mpi openmpi'
# Analysis settings
DatamodelRoot().SolutionControl.MinimumIterations = 1
DatamodelRoot().SolutionControl.MaximumIterations = 20
DatamodelRoot().SolutionControl.TimeStepSize = 0.005
DatamodelRoot().SolutionControl.EndTime = 2.4
# Add stabilization
DataTrans1 = DatamodelRoot().CouplingInterface['Interface-1'].DataTransfer['displacement']
DataTrans1.ConvergenceTarget = 0.01
DataTrans1.Stabilization.Option = 'Quasi-Newton'
DataTrans1.Stabilization.MaximumRetainedTimeSteps = 1
DataTrans1.Stabilization.InitialRelaxationFactor = 0.1
DataTrans1.PrintState
DataTrans2 = DatamodelRoot().CouplingInterface['Interface-1'].DataTransfer['FORC']
DataTrans2.ConvergenceTarget = 0.01
DataTrans2.Stabilization.Option = 'None'
DataTrans2.PrintState
# Create restart points at every 5 time steps
DatamodelRoot().OutputControl.Option = 'StepInterval'
DatamodelRoot().OutputControl.OutputFrequency = '5'
Solve
You can also view all my input and output files in the .7p file that I've attached to my initial post.
Best, Judith
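[Editor's note] The `--cnf=${NODEFILE}` flag in the launch command above expects a machines file listing the allocated hosts, one per line. As a hedged sketch (not from the thread): under SLURM such a file is typically built with `scontrol show hostnames "$SLURM_JOB_NODELIST"`; since that command only exists on a cluster, the `expand_nodelist` helper below is a simplified, hypothetical stand-in that handles only a single `prefix[start-end]` pattern, for illustration.

```shell
# Hypothetical sketch: build the machines file that --cnf=${NODEFILE} expects,
# one hostname per line. On a real cluster you would normally run:
#   scontrol show hostnames "$SLURM_JOB_NODELIST" > "$NODEFILE"
# expand_nodelist is a simplified stand-in handling one "prefix[start-end]" pattern.
expand_nodelist() {
    case "$1" in
        *\[*-*\]*)
            prefix=${1%%\[*}              # text before the bracket, e.g. "tcn"
            range=${1#*\[}                # strip up to and including "["
            range=${range%\]}             # strip trailing "]", leaving "362-363"
            seq -f "${prefix}%g" "${range%-*}" "${range#*-}"
            ;;
        *)
            printf '%s\n' "$1"            # plain hostname, pass through
            ;;
    esac
}

NODEFILE=machines.txt
expand_nodelist 'tcn[362-363]' > "$NODEFILE"
cat "$NODEFILE"
```

The resulting file can then be passed to `systemcoupling --cnf=machines.txt`, matching the launch line quoted above.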
January 11, 2022 at 9:27 am
Paul Hutcheson (Ansys Employee):
Hi Judith, please check the line:
execCon['FLUENT-1'].ExecutionControl.AdditionalArguments = '-mpi=openmpi -meshing -gu'
"-meshing" launches Fluent Meshing, which must be a mistake and could be why System Coupling fails during mapping, since it expects the Fluent solver. Can you remove "-meshing" and try again?
Paul
January 11, 2022 at 9:48 am
jfonken (Subscriber):
Hi Paul, I noticed this mistake in my input file a while ago as well and corrected it; I forgot to update it on the forum. Unfortunately, removing "-meshing" didn't resolve my problem. Would you have any other suggestions?
Best, Judith
January 11, 2022 at 11:07 am
Paul Hutcheson (Ansys Employee):
Hi Judith, I'm not sure of the cause of the error yet, then. The debug files would need to be interpreted by a developer.
Did you try the default MPI, by removing all MPI options?
Note also that System Coupling can read SLURM environment variables if they are first set by a bash script, at the end of which SyC is launched with:
"/sw/arch/Centos8/EB_production/2021/software/ANSYS/2021R2/v212/SystemCoupling/bin/systemcoupling" -R runFSI.txt -s3
Note that the core count for System Coupling is set with "-sN", not "-lN", where N is the core count.
Paul
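[Editor's note] Following Paul's point about SLURM environment variables and the "-sN" flag, here is a hypothetical sketch of deriving the core count from the allocation instead of hard-coding it. The variable names and the fallback value are assumptions for illustration; `SLURM_NTASKS` is the variable SLURM exports inside a job, and the command is echoed rather than executed because this is only a sketch.

```shell
# Hypothetical sketch: derive the System Coupling core count (-sN) from the
# SLURM allocation. SLURM_NTASKS is set inside a SLURM job; the fallback
# value of 6 is only so this sketch runs outside SLURM.
SYSC_CORES=${SLURM_NTASKS:-6}
SYSC_BIN="/sw/arch/Centos8/EB_production/2021/software/ANSYS/2021R2/v212/SystemCoupling/bin/systemcoupling"

# Build the launch line; in a real job script you would execute it instead.
LAUNCH_CMD="$SYSC_BIN -R runFSI.txt -s${SYSC_CORES}"
echo "$LAUNCH_CMD"
```

Inside a submission script like the one Paul posts below, this makes the core count track `#SBATCH --ntasks-per-node` and `--nodes` automatically.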
January 11, 2022 at 12:58 pm
jfonken (Subscriber):
Hi Paul, I tried the default MPI as well, but unfortunately without any success.
I launch SyC with:
"/sw/arch/Centos8/EB_production/2021/software/ANSYS/2021R2/v212/SystemCoupling/bin/systemcoupling" --mpi openmpi --cnf=${NODEFILE} -R runFSI.txt -l3
Here I use the --cnf flag to assign a certain number of cores. I also tried it with the -s and -t flags, but the simulation only works when just 1 core is used for Fluent and 1 for Mechanical APDL. The -l flag was set to get debug output.
Best, Judith
January 11, 2022 at 1:46 pm
Paul Hutcheson (Ansys Employee):
Hi Judith, OK, it's interesting that it worked when 1 core was assigned to each solver.
runFSI should be a Python script, ideally with the extension .py. I haven't tested whether the extension .txt will be interpreted as a Python script; if it works with 1 core per solver, though, this is likely not the problem.
Since it seems to be happy with 1 core each but not more, I think we should look at the job submission method and core assignment. Have you got a SLURM submission script that sets environment variables? Here is an example you can use for SyC jobs on SLURM:
#!/bin/bash -l
#
# Set slurm options as needed
#
#SBATCH --job-name SYSC
#SBATCH --nodes=2
#SBATCH --partition=ottc02
#SBATCH --ntasks-per-node=32
#SBATCH --output=%x-%j.out
#SBATCH --error=%x-%j.err
#SBATCH --export=ALL
export AWP_ROOT212=/sw/arch/Centos8/EB_production/2021/software/ANSYS/2021R2/v212
#
export SYSC_ROOT=${AWP_ROOT212}/SystemCoupling
#
# print job start time and Slurm job resources
#
date
echo "SLURM_JOB_ID : "$SLURM_JOB_ID
echo "SLURM_JOB_NODELIST : "$SLURM_JOB_NODELIST
echo "SLURM_JOB_NUM_NODES : "$SLURM_JOB_NUM_NODES
echo "SLURM_NODELIST : "$SLURM_NODELIST
echo "SLURM_NTASKS : "$SLURM_NTASKS
echo "SLURM_TASKS_PER_NODE : "$SLURM_TASKS_PER_NODE
echo "working directory : "$SLURM_SUBMIT_DIR
#
echo "Running System Coupling"
echo "System coupling main execution host is $HOSTNAME"
echo "Current working directory is $PWD"
#echo "ANSYS install root is $AWP_ROOT212"
echo "System coupling root is $SYSC_ROOT"
echo "Run script is $1"
echo
"$SYSC_ROOT/bin/systemcoupling" -R runFSI.txt
You'd have to change the job name, nodes, partition, and ntasks-per-node.
Paul
Viewing 13 reply threads. The topic '(MPI?) problem when running System Coupling on Linux HPC cluster' is closed to new replies.