General Mechanical

Topics related to Mechanical Enterprise, Motion, Additive Print, and more

Computation Acceleration

    • zzhang868
      Bbp_participant

      Hello,

      I’m looking for advice on how to effectively accelerate computation in ANSYS Mechanical.

      I've tried using a GPU (RTX 6000), and while "Solution Statistics" indicates that the GPU is enabled, there's no noticeable speed improvement. Additionally, increasing the number of CPU cores or nodes hasn't helped. For example:

      • Case 1: I currently use a server with 3 nodes, each with 24 cores. Increasing to 25 nodes doesn’t improve computation time, and reducing the number of cores slows it down further.
      • Case 2: Even with GPU enabled, the computation remains slow.

      Could you advise on the optimal settings for better performance? What might be causing the inefficiencies in these cases?

      Thank you for your kind help.


    • zzhang868
      Bbp_participant

      I am wondering what the difference is between adding "Python Code" under "Transient Analysis" and using "Scripting" at the top. Thanks!
      Option 1:


      Option 2:

    • Ashish Khemka
      Forum Moderator

      Hi Zihan,

       

      For the second query, could you please create a separate forum thread?

       

      Regards,

      Ashish Khemka

      • zzhang868
        Bbp_participant

        I have moved the second question to a separate thread! Could you please have a look at these questions? Any suggestions would be appreciated!

    • mrife
      Ansys Employee

      Hi Zihan

      For the first question, let me give a little background first.  In a distributed parallel solution running on N CPU cores, the FEM domain is split at the element level into N subdomains.  N instances of the solver process (here MAPDL) are launched, and each process solves one of the subdomains.  Each process has to communicate with the other processes that share nodes with it (FEM nodes, not compute cluster nodes).  If that communication outweighs the computation each CPU core is doing, then too many CPU cores are being used.  So for any model there is a point beyond which adding CPU cores just slows down the solve.

      In the mathematics of solving FEA equations, we work with degrees of freedom (DOFs) rather than the number of nodes or elements.  Let's say this is an FEM model with only solid structural elements, which have translational DOFs in X, Y and Z.  Then the total number of DOFs is 3 * number of nodes.

      To decide how many CPU cores to use, we need to know the range of DOFs per CPU core in which the core is most efficient.  When working with a new CPU that I've not tested before, I assume something like 30,000 DOFs per CPU core and work out how many CPU cores and compute nodes are needed from there, ignoring (for now) any question of network communication speed between the compute nodes.
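
      As a rough sketch of that arithmetic: the function name and the 1,000,000-node example below are hypothetical, and the 30,000 DOFs per core figure is only the starting assumption mentioned above, not a measured optimum.

```python
import math


def estimate_cores(num_fem_nodes, dofs_per_fem_node=3,
                   dofs_per_core=30_000, cores_per_compute_node=24):
    """Return (total DOFs, CPU cores, compute nodes) for a first test run.

    dofs_per_fem_node=3 assumes solid structural elements with only
    translational DOFs in X, Y and Z. cores_per_compute_node=24 matches
    the cluster described in the question.
    """
    total_dofs = num_fem_nodes * dofs_per_fem_node
    # Round up so no core is asked to carry more than the target load.
    cores = math.ceil(total_dofs / dofs_per_core)
    compute_nodes = math.ceil(cores / cores_per_compute_node)
    return total_dofs, cores, compute_nodes


# Hypothetical model with 1,000,000 FEM nodes (not the model in this thread).
total_dofs, cores, compute_nodes = estimate_cores(1_000_000)
print(total_dofs, cores, compute_nodes)
```

      For that made-up model the estimate is 3,000,000 DOFs, 100 CPU cores and 5 compute nodes. The result is only a starting point for testing, since the efficient range depends on the hardware.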

      So for your model I'd try about 170 CPU cores, or 7 compute nodes (7 x 24 = 168 cores), as a test.  Then maybe try 6 and 8 compute nodes (fully used) to see how the hardware responds to slightly more and slightly fewer DOFs per CPU core.  Depending on what happens, you may then want to run other tests.
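
      To see what each of those test runs means in DOFs per CPU core, a small table can be computed up front. This is a sketch only: the function name is hypothetical, and the 5,100,000 total DOFs is an assumed figure (170 cores times 30,000 DOFs per core), not a number reported for this model.

```python
def dofs_per_core_sweep(total_dofs, compute_node_counts, cores_per_compute_node=24):
    """List (compute nodes, CPU cores, DOFs per core) for fully used nodes."""
    rows = []
    for compute_nodes in compute_node_counts:
        cores = compute_nodes * cores_per_compute_node
        # Integer division is enough here; only the rough load per core matters.
        rows.append((compute_nodes, cores, total_dofs // cores))
    return rows


# Assumed model size: 170 cores * 30,000 DOFs per core = 5,100,000 DOFs.
for compute_nodes, cores, load in dofs_per_core_sweep(5_100_000, [6, 7, 8]):
    print(f"{compute_nodes} compute nodes, {cores} cores, ~{load} DOFs per core")
```

      Comparing the measured solve times against this table shows whether the hardware prefers a heavier or lighter load per core.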

      For now I'd not use the GPU as a solver accelerator.  Since the GPU is helping all of the solver processes, one GPU for 24 CPU cores is stretched a bit thin; I'd prefer to see 2 GPUs for those 24 cores, keeping to about 1 GPU per 12 or so CPU cores (and hence per 12 or so solver processes).
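
      The same rule of thumb as a short sketch. The function name is hypothetical, and the ratio of 12 cores per GPU is just the approximate figure given above, not a hard limit.

```python
import math


def gpus_for_cores(cpu_cores, cores_per_gpu=12):
    """GPUs suggested for a given number of CPU cores (solver processes)."""
    return math.ceil(cpu_cores / cores_per_gpu)


print(gpus_for_cores(24))  # one 24-core compute node
```

      For one 24-core compute node this gives 2 GPUs, so a single GPU shared by 24 solver processes sits at twice the suggested ratio.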

      I'm also assuming the sparse (direct) solver is being used.  With the iterative solver, the efficient range of DOFs per CPU core will probably be different for that cluster.

       

    • zzhang868
      Bbp_participant

      Thank you so much for your feedback!
