TAGGED: cfd-dem, coupled-simulation, rocky-dem
November 10, 2025 at 9:28 pm
parsaary
Subscriber
Hi,
I need some help speeding up a 2-way coupled Fluent-Rocky simulation being run on an HPC. I am currently using a single node, since I think Rocky doesn't work on multiple nodes. The current allocation is 4 GPUs for Rocky, 20 CPUs for Fluent, and the remaining 4 CPUs in the node are left free to be used for data transfer between Rocky and Fluent. I don't know what the bottleneck is, but I have pasted below an excerpt from the Fluent and Rocky logs at the same flow time (a sketch of how this allocation maps to a batch request follows the logs). Thanks for your help!
Fluent log:
...Signal received from Rocky.
...Reading Rocky data.
...Rocky data read message sent!
Flow time = 0.0005170320000002374s, time step = 25060
********************************************************************
Elapsed time = 4099.916666666666 s
********************************************************************
/solve/dual-time-iterate 1 50
Updating solution at time level N...
done.
iter continuity u-water u-particle v-water v-particle w-water w-particle k-water k-particle eps-water eps-partic vf-particl time/iter
30380 2.4872e-06 3.0265e-07 0.0000e+00 2.1331e-07 0.0000e+00 3.5284e-08 0.0000e+00 2.4211e-05 0.0000e+00 1.6416e-09 0.0000e+00 1.1737e-05 0:10:07 50
...ReceiveFluentDataReadMessage Received
...Exporting Fluent flow data to Rocky!
...Exporting Reference Density = 1.23e+00
...Fluent flow data written to Rocky!
...Fluent data sent to Rocky! 30381 2.5017e-06 3.0370e-07 0.0000e+00 2.1372e-07 0.0000e+00 3.5308e-08 0.0000e+00 2.4211e-05 0.0000e+00 1.6415e-09 0.0000e+00 1.1737e-05 0:09:55 49
!30381 solution is converged
...Signal received from Rocky.
...Reading Rocky data.
...Rocky data read message sent!
Flow time = 0.0005170492000002374s, time step = 25061
********************************************************************
Elapsed time = 4100.716666666667 s
Rocky Log:
message date="2025-10-23 17:09:04" up_time="11:19:36.149" CFDCoupling: Send CFD data
Rocky current time = 5.17032e-05 iteration: 24048 last output time: 1.72172e-05
CFD flow time: 1.72e-08 iteration: 0
message date="2025-10-23 17:09:04" up_time="11:19:36.150" Sending fluent message
message date="2025-10-23 17:09:08" up_time="11:19:40.675" Send Rocky Data Write Message.
message date="2025-10-23 17:09:08" up_time="11:19:40.729" Waiting for fluent message
message date="2025-10-23 17:09:16" up_time="11:19:48.482" Fluent write data message received.
message date="2025-10-23 17:09:17" up_time="11:19:49.283" Sent Fluent Data Read Message.
message date="2025-10-23 17:09:17" up_time="11:19:49.283" CFDCoupling: Received CFD data. Target DT = 1.72e-08
message date="2025-10-23 17:09:17" up_time="11:19:49.535" CFDCoupling: Send CFD data
Rocky current time = 5.17204e-05 iteration: 24056 last output time: 1.72172e-05
CFD flow time: 1.72e-08 iteration: 0
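A simplified sketch of how the allocation described above maps onto a SLURM request (the directives and names below are assumptions for illustration, not the exact script used):

#!/bin/bash
# Single node: 4 GPUs for Rocky, 20 of the 24 cores for Fluent,
# 4 cores left unrequested for the coupling's data transfer
#SBATCH --job-name=fluent-rocky-2way
#SBATCH --nodes=1
#SBATCH --ntasks=20          # Fluent processes (local parallel)
#SBATCH --gres=gpu:4         # GPUs used by the Rocky solver

# The Fluent and Rocky launch commands are omitted here because they
# depend on the Rocky version and how the two-way coupling is started.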
-
November 17, 2025 at 11:24 am
Jackson Gomes
Ansys Employee
Dear Parsary,
If the bottleneck is on the Fluent side, you can allocate more CPU cores and even run Fluent in distributed mode. If the bottleneck is on the Rocky side, you can allocate more GPUs to increase performance. For additional details on Rocky performance and hardware recommendations, please refer to the Rocky GPU Buying Guide available in the Ansys Knowledge resources.
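For reference, a distributed Fluent run is mostly a matter of passing a total process count and a machine file at launch. A minimal sketch, where the hosts file and journal names are placeholders and the exact options depend on the cluster's MPI setup (in a Rocky two-way coupled run the launch normally goes through the coupling workflow, so the same settings would be applied there):

# 40 Fluent processes spread over the machines listed in hosts.txt, no GUI, driven by a journal file
fluent 3ddp -g -t40 -cnf=hosts.txt -i coupled_run.jou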
Hope this helps!
If you’d like to explore more learning materials about Rocky software, you can find them here: https://innovationspace.ansys.com/ais-rocky/
Warm Regards,
Jackson
-
November 18, 2025 at 4:25 am
parsaary
Subscriber
Hi Jackson,
Thank you very much for your help. I think the bottleneck is down to the Fluent portion of the simulation. Usually I run a simulation on 1 HPC node (each node has 4 GPUs, which I allocate to Rocky, and 24 CPUs, 20 of which I allocate to Fluent in local parallel, and 4 CPUs I leave unrequested so they can handle data transfer etc.).
However, am I correct in understanding that Rocky cannot run across multiple nodes? In that case, would I only be using the GPUs from node 1 for Rocky? E.g., when using 2 of my usual HPC nodes, 40 CPUs (20/node) would go to Fluent, 8 CPUs (4/node) would be left unrequested (data transfer etc.), but still only 4 GPUs (from one node) could be requested for Rocky, leaving the 4 GPUs from node 2 unused.
If my above understanding is correct and only GPUs from 1 node can be used for Rocky, would it be possible to use only 1 such HPC node (with both GPUs and CPUs) and request the second node as a simple CPU node? I ask this since my university has many more CPU nodes than CPU+GPU nodes, so the queue times would be much shorter if I could employ this type of parallel distribution. Thank you very much for your help.
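Something like the SLURM sketch below is what I have in mind; the heterogeneous-job syntax, the partition setup and the actual Fluent/Rocky launch commands are assumptions I would still need to confirm with our HPC admins, so it is only meant to illustrate the resource request:

#!/bin/bash
# Component 0: the GPU node (4 GPUs for Rocky, 20 cores for Fluent)
#SBATCH --job-name=fluent-rocky-2way
#SBATCH --nodes=1
#SBATCH --ntasks=20
#SBATCH --gres=gpu:4
#SBATCH hetjob
# Component 1: a plain CPU node contributing 20 more cores to Fluent
#SBATCH --nodes=1
#SBATCH --ntasks=20

# Machine file covering both components so Fluent can run distributed
# parallel on 40 cores (used with the -t/-cnf launch options above)
scontrol show hostnames "$SLURM_JOB_NODELIST_HET_GROUP_0"  > hosts.txt
scontrol show hostnames "$SLURM_JOB_NODELIST_HET_GROUP_1" >> hosts.txt

# Rocky would stay on this GPU node and use its 4 GPUs; the actual
# Fluent and Rocky launch commands are left out because they depend
# on the Rocky version and how the coupling is started.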
Best regards,
Parsa