Computation Acceleration

- August 26, 2024 at 4:52 pm
  zzhang868
  Subscriber
  Hello,
  I’m looking for advice on how to effectively accelerate computation in ANSYS Mechanical.
  I've tried using a GPU (RTX 6000), and while "Solution Statistics" indicates that the GPU is enabled, there's no noticeable speed improvement. Additionally, increasing the number of CPU cores or nodes hasn't helped. For example:
  - Case 1: I currently use a server with 3 nodes, each with 24 cores. Increasing to 25 nodes doesn’t improve computation time, and reducing the number of cores slows it down further.
  - Case 2: Even with GPU enabled, the computation remains slow.
  Could you advise on the optimal settings for better performance? What might be causing the inefficiencies in these cases?
  Thank you for your kind help.
- August 26, 2024 at 8:22 pm
  
  zzhang868
  Subscriber
  
  I am wondering what the difference is between adding "Python Code" under "Transient Analysis" and using "Scripting" at the top. Thanks!
  Option 1:
  
  Option 2:
- August 27, 2024 at 12:53 pm
  
  Ashish Khemka
  Forum Moderator
  
  Hi Zihan,
  
  For the second query can you please create a separate forum thread?
  
  Regards,
  Ashish Khemka
  - August 27, 2024 at 3:11 pm
    
    zzhang868
    Subscriber
    
    I have moved it! Could you please have a look at these questions? Any suggestions would be appreciated!
- September 11, 2024 at 5:08 pm
  
  mrife
  Ansys Employee
  
  Hi Zihan
  For the first question, let me give a little background first. In distributed parallel solutions, the FEM domain is split into N domains at the element level when using N CPU cores. N instances of the solver process (here MAPDL) are then launched, and each process solves one of the smaller domains. Each process has to communicate with the other processes that share nodes (FEM nodes, not compute cluster nodes). If a CPU core spends more time on this communication than on computation, too many CPU cores are being used. So for any model there is a point beyond which adding CPU cores just slows down the solve.
  In the mathematics of solving FEA equations we use degrees of freedom (DOFs) instead of the number of nodes or elements. Let's say this is a FEM with only solid structural elements with translational DOFs in X, Y and Z. Then the total number of DOFs is 3 * the number of FEM nodes.
  To know how many CPU cores to use, we need to know the range of DOFs per CPU core in which the core is most efficient. When working with a new CPU that I've not tested before, I assume something like 30,000 DOFs per CPU core and work out how many CPU cores and compute nodes are needed from there, ignoring any question of network communication speed between the compute nodes (for now).
  So for your model I'd try 170 CPU cores, or 7 compute nodes, as a test. Then maybe try 6 and 8 compute nodes (fully used) to see how the hardware responds to a few more and a few fewer DOFs per CPU core. Depending on what happens, you may then want to run other tests.
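  The sizing rule of thumb above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not an Ansys tool: the function names and the FEM node count are hypothetical, 30,000 DOFs per core is the starting guess from the reply rather than a measured value, and 24 cores per compute node comes from the original question.

```python
def estimate_cores(fem_nodes, dofs_per_fem_node=3, dofs_per_core=30_000):
    """Return (total DOFs, suggested CPU core count) for a starting test.

    dofs_per_fem_node=3 assumes solid structural elements with only
    translational DOFs in X, Y and Z. dofs_per_core is a rough first
    guess to be refined by timing runs on the actual hardware.
    """
    total_dofs = fem_nodes * dofs_per_fem_node
    cores = max(1, round(total_dofs / dofs_per_core))
    return total_dofs, cores


def estimate_compute_nodes(cores, cores_per_compute_node=24):
    """Nearest whole number of fully used compute nodes for a core count."""
    return max(1, round(cores / cores_per_compute_node))


# Hypothetical model size, chosen only to illustrate the calculation.
fem_nodes = 1_000_000
total_dofs, cores = estimate_cores(fem_nodes)
compute_nodes = estimate_compute_nodes(cores)
print(f"{total_dofs} DOFs, about {cores} cores, about {compute_nodes} compute nodes")
```

  With the figures from the reply, `estimate_compute_nodes(170)` gives 7, which matches the suggested starting point. Runs at one compute node fewer and one more then bracket that estimate.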
  For now I'd not use the GPU as a solver accelerator. Since the GPU is helping all of the solver processes, one GPU for 24 CPU cores is a bit much. I'd prefer to see 2 GPUs for those 24 cores, keeping to about 1 GPU per 12 or so CPU cores (and hence solver processes).
  I'm also assuming the use of the sparse (direct) solver. If you are using the iterative solver, the DOFs per CPU core will probably be different for that cluster.
- September 12, 2024 at 10:27 pm
  
  zzhang868
  Subscriber
  
  Thank you so much for your feedback!