Ansys Products

Ansys Fluent – Error with MPI

    • ae22b001
      Subscriber

      Hi, I'm trying to run a simulation using Ansys Fluent 2023 R1 on my university's HPCE cluster but ran into some trouble.

      This is what I used in the job file: "fluent 2ddp -g -mpi=intel -t10 -cnf=$PBS_NODEFILE -i trial1journal.jou > fluentdata.out".
      I am getting some errors, mostly caused by MPI, but I'm not able to figure out what the issue is.
      Part of the error file:

      This probably means that Tcl wasn't installed properly.
       
      application-specific initialization failed: Can't find a usable init.tcl in the following directories: 
          /home2/polyflowbuilds/buildagents/milpolybld01/work/1210bc2a0080eb69/Tcl-tk/8_5_11/linx64/Release/tcltk/lib/tcl8.5 /usr/lib/tcl8.5 /lib/tcl8.5 /usr/library /library /tcl8.5.11/library /tcl8.5.11/library
       
       
       
      This probably means that Tcl wasn't installed properly.
       
      *** Error in `/lfs/sware/ansys2023r1/ansys_inc/v231/fluent/fluent23.1.0/lnamd64/2ddp_node/fluent_mpi.23.1.0': free(): invalid next size (fast): 0x00000000065bcca0 ***
      *** Error in `/lfs/sware/ansys2023r1/ansys_inc/v231/fluent/fluent23.1.0/lnamd64/2ddp_node/fluent_mpi.23.1.0': free(): invalid next size (fast): 0x000000000738aca0 ***
      ======= Backtrace: =========
      /lib64/libc.so.6(+0x81489)[0x2ab7ff78b489]
      /lfs/sware/ansys2023r1/ansys_inc/v231/fluent/fluent23.1.0/multiport/lnamd64/mpi/shared/libmport.so(+0x87760)[0x2ab7eaa3c760]
      /lfs/sware/ansys2023r1/ansys_inc/v231/fluent/fluent23.1.0/multiport/lnamd64/mpi/shared/libmport.so(vmfree+0xe1)[0x2ab7eaa3ad58]
      /lfs/sware/ansys2023r1/ansys_inc/v231/fluent/fluent23.1.0/lnamd64/2ddp_node/fluent_mpi.23.1.0(CX_Free_MU_Unsafe+0x181)[0x28f2fa1]
      /lfs/sware/ansys2023r1/ansys_inc/v231/fluent/fluent23.1.0/lnamd64/2ddp_node/fluent_mpi.23.1.0[0x1f6773c]

      Another error file:
      myid (5): Fatal signal raised sig = SIGIOT 
       /lfs/sware/ansys2023r1/ansys_inc/v231/fluent/fluent23.1.0/lnamd64/2ddp_node/fluent_mpi.23.1.0() [0x2927f0c]
       /lib64/libpthread.so.0(+0xf5d0) [0x2ab7edade5d0]
       /lib64/libc.so.6(gsignal+0x37) [0x2ab7ff740207]
       /lib64/libc.so.6(abort+0x148) [0x2ab7ff7418f8]
       /lib64/libc.so.6(+0x78d27) [0x2ab7ff782d27]

      And the transcript file:
      Building...
           mesh
      auto partitioning mesh by Metis (fast
      ===============Message from the Cortex Process================================
       
      Fatal error in one of the compute processes.
       
      ==============================================================================

      Can you please help me with solving this? I'm trying to simulate using 1 node and 10 cores.
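
One thing worth ruling out with a launch line like the one above is a mismatch between the hard-coded -t10 and the number of slots the scheduler actually granted. The sketch below is only an illustration, assuming the PBS node file lists one line per allocated slot (this depends on how the cluster is configured); the helper names count_pbs_slots and build_fluent_command are hypothetical, and the flags are copied from the post rather than taken from the Fluent documentation.

```python
import os


def count_pbs_slots(nodefile_path):
    """Count non-empty lines in a PBS node file.

    Assumption: the scheduler writes one line per allocated slot.
    """
    with open(nodefile_path) as fh:
        return sum(1 for line in fh if line.strip())


def build_fluent_command(nodefile_path, journal, version="2ddp"):
    """Rebuild the launch line from the post, deriving -t from the node file."""
    nprocs = count_pbs_slots(nodefile_path)
    return [
        "fluent", version, "-g", "-mpi=intel",
        f"-t{nprocs}", f"-cnf={nodefile_path}", "-i", journal,
    ]


if __name__ == "__main__":
    # PBS sets PBS_NODEFILE inside a running job; outside a job nothing is printed.
    nodefile = os.environ.get("PBS_NODEFILE")
    if nodefile:
        print(" ".join(build_fluent_command(nodefile, "trial1journal.jou")))
```

Printing the command before running it makes it easy to compare the derived process count with the resources requested in the job file.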
    • ae22b001
      Subscriber

      I think this error is caused by auto-partitioning. I tried another run after creating the case file on my Windows PC with the same number of cores that I'm going to use on the cluster, and it worked.

      So the problem is definitely due to auto-partitioning.

    • RK
      Ansys Employee

       

      Hi Ashwin, 

      Can you please try with -mpi = intel 2018?

      What is the size of the mesh you are loading, and how many cores are you using? Is the error specific to this model, or can you replicate it with any large model?

       

      • ae22b001
        Subscriber

        I will try that out now.
        The mesh has 250k elements. When making the case file on my Windows PC I used 5 cores, and on the cluster I used more (hence the auto-partitioning).
        In another trial I used the same number of nodes on my PC and the cluster, and it worked. (Also, auto-partitioning works on my PC but not on the cluster.)

         

        • ae22b001
          Subscriber

          -mpi = intel 2019 also did not work

           

        • ae22b001
          Subscriber

          -mpi = intel 2018 also did not work

           

    • ae22b001
      Subscriber

      Hi, is there any support regarding this?
      I'm getting similar errors, like this one:

      Building...
           mesh
      distributing mesh
      parts.....,
      faces.....,
      nodes.....,
      cells.....,
              bandwidth reduction using Reverse Cuthill-McKee: 39635/196 = 202.219
      ===============Message from the Cortex Process================================
       
      Fatal error in one of the compute processes.
       
      ==============================================================================
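
When comparing transcripts like the two posted in this thread, it can help to pull out the last step Fluent logged before the Cortex banner, since that line differs between the runs (the Metis auto-partitioning line in the first, the Reverse Cuthill-McKee line in the second). The sketch below is a minimal illustration that assumes the transcript layout shown above; the function name last_step_before_fatal is hypothetical, and the result only locates the last logged step, it does not identify the root cause.

```python
def last_step_before_fatal(transcript):
    """Return the last non-empty line logged before the Cortex banner.

    Returns None if the banner is absent or nothing was logged before it.
    """
    last = None
    for line in transcript.splitlines():
        stripped = line.strip()
        if "Message from the Cortex Process" in stripped:
            return last
        if stripped:
            last = stripped
    return None


if __name__ == "__main__":
    # Excerpt copied from the transcript in the post above.
    sample = (
        "Building...\n"
        "     mesh\n"
        "distributing mesh\n"
        "        bandwidth reduction using Reverse Cuthill-McKee: 39635/196 = 202.219\n"
        "===============Message from the Cortex Process================================\n"
        " \n"
        "Fatal error in one of the compute processes.\n"
    )
    print(last_step_before_fatal(sample))
```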
    • RK
      Ansys Employee

      I would suggest launching Fluent Meshing on the cluster, loading the case, and then using the Switch to Solution option to move to the solver.
