Ansys Free Student Software

Topics cover installation and configuration of our free student products.

HELP: Fluent freezes on HPC Cluster

    • mahereid97
      Subscriber

      Hello,


      I am trying to simulate a wind turbine in Fluent on an HPC cluster. After solving several time steps (usually fewer than 20), the calculation stops, and the console prints only the following message without starting the next time step. (Screenshot attached)


      "Updating solution at time levels N and N-1 using variable time stepping method.


      Done."


      Also, all CPUs are still running at 100%, yet I have no idea what they are calculating.


      The cluster works fine with other cases and problems, so it appears to be set up correctly, and this specific case runs normally on a regular PC, so the case setup should be correct.


      Any suggestion would be great


      Thanks!

    • Rob
      Forum Moderator

      Can you check the solver isn't trying to write output files to somewhere that is full or doesn't exist?  
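
One way to run that check from a login or compute node is a small script like the one below. It is a minimal sketch, not an Ansys tool: the function name `check_output_dir`, the default 10 GiB threshold, and the example path are all assumptions made for illustration.

```python
import os
import shutil


def check_output_dir(path, min_free_gb=10.0):
    """Return a list of problems found with an output directory (empty if none)."""
    problems = []
    if not os.path.isdir(path):
        problems.append(f"{path} does not exist or is not a directory")
        return problems
    if not os.access(path, os.W_OK):
        problems.append(f"{path} is not writable by the current user")
    # disk_usage reports bytes for the filesystem that contains `path`
    free_gb = shutil.disk_usage(path).free / 1024**3
    if free_gb < min_free_gb:
        problems.append(f"{path} has only {free_gb:.1f} GiB free")
    return problems


# Hypothetical path: replace with the folder the case and data files are saved to.
for problem in check_output_dir("/path/to/fluent/project"):
    print(problem)
```

On a cluster it is worth running this on a compute node as well as the head node, since the two may not mount the same filesystems.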

    • mahereid97
      Subscriber
      Autosave Case and Data is not due on the time step at which Fluent freezes. Does anything get written between time steps? In any case, my project is saved in a folder with plenty of space, and other cases work fine.
    • Rob
      Forum Moderator

      Only if you've set monitors or image export. Can you also check RAM usage, and whether DPM updates or the like are triggered.

      If everything else works then it's likely that model, so you'll need to work through mesh and settings to see what the cause is. 
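
For the RAM check on a Linux node, one option is to read `/proc/meminfo` directly. The sketch below assumes a Linux kernel recent enough to report `MemAvailable`; the function names are illustrative and nothing here is Fluent-specific.

```python
def parse_meminfo(text):
    """Parse the contents of /proc/meminfo into a dict of integer values."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def available_fraction(meminfo):
    """Fraction of total RAM that the kernel reports as available."""
    return meminfo["MemAvailable"] / meminfo["MemTotal"]


def read_available_fraction(path="/proc/meminfo"):
    """Read the live value on a Linux machine."""
    with open(path) as f:
        return available_fraction(parse_meminfo(f.read()))
```

Calling `read_available_fraction()` on each compute node shortly before the expected freeze shows whether any single node is close to running out of memory, which a cluster-wide average can hide.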

    • mahereid97
      Subscriber
      The only monitor I am saving is the moment coefficient at each time step. RAM is not the problem because there is plenty, and the only relevant model is LES for turbulence (I tried k-omega, but got the same problem). Some additional details about the case:

      - mesh has around 33 million elements
      - periodic boundaries, sliding mesh, and symmetry are involved
      - cluster has 7 nodes with 16 cores each
      - time step is around 6e-5 s

      What's weird is that all the compute cores are working at 100%, while the main management (host) cores drop from about 10% to 4-6% when the problem occurs.
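
For scale, the figures in this post can be turned into a rough cells-per-core number. This is plain arithmetic on the values quoted above and assumes every core on every node is used as a compute process; it does not establish whether the partition count is related to the freeze.

```python
cells = 33_000_000       # "around 33m elements" from the post
nodes = 7
cores_per_node = 16

total_cores = nodes * cores_per_node
cells_per_core = cells / total_cores

print(f"{total_cores} cores, about {cells_per_core:,.0f} cells per core")
```

Under that assumption the case is split across 112 cores, at roughly 295,000 cells each.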
    • Rob
      Forum Moderator

      Can you run with mesh motion off?  We need to work through to see what's getting stuck, and if nothing else is changing it's a good starting point. 

    • mahereid97
      Subscriber
      I switched to MRF instead of sliding mesh and the problem is gone (also, "Updating solution at time levels N and N-1" is now instantaneous). But I still need to use the sliding mesh for my simulations.
    • Rob
      Forum Moderator

      Right, so if the model sticks at a set time (not necessarily number of timesteps) what changes at that point?

    • mahereid97
      Subscriber
      What do you mean by that?
    • Rob
      Forum Moderator

      If a model runs well and hasn't diverged but then "sticks", it's often either the hardware or saving (which you've checked) or something changes within the case. If MRF works, it suggests there's a problem with sliding mesh. So, what happens with the mesh at the point the model goes wrong?

    • mahereid97
      Subscriber
      Oh, when Fluent freezes I cannot check the mesh or case, because the calculation can't be stopped unless I kill the processes, which makes Fluent close suddenly. Would it help if I let Fluent display the mesh at each time step? And do I need to display or monitor anything else at each time step to find the problem?
    • Rob
      Forum Moderator

      Or run to just before it fails and have a look. In the File menu there's also an option to Write Transcript. That should dump out everything that's written to the TUI and can be quite helpful. 
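
Once a transcript file is being written, a small script can flag the moment it stops growing and show the last thing that was printed. This is a sketch under stated assumptions: the function names and the 10-minute default threshold are illustrative, and the transcript path is whatever was chosen when starting the transcript.

```python
import os
import time


def seconds_since_update(path, now=None):
    """Seconds since the file at `path` was last modified."""
    if now is None:
        now = time.time()
    return now - os.path.getmtime(path)


def last_nonblank_line(path):
    """Return the last non-blank line of a text file, or '' if there is none."""
    last = ""
    with open(path, errors="replace") as f:
        for line in f:
            if line.strip():
                last = line.rstrip("\n")
    return last


def looks_stalled(path, threshold_s=600):
    """True if the transcript has not been written to for `threshold_s` seconds."""
    return seconds_since_update(path) > threshold_s
```

Polling `looks_stalled(...)` from a separate shell and printing `last_nonblank_line(...)` when it returns true records how far the solver got, without having to watch the console.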

    • mahereid97
      Subscriber
      The mesh looks normal and as it should at the given time step. What exactly am I looking for? It doesn't look like a normal Fluent crash or failure, because something is still being calculated between the time steps. I guess the mesh is being rotated during this time, so could it be a partitioning problem, in which the host CPU fails to assign moving nodes to CPU cores? (Keep in mind that the same case does not have this problem on a single 16-core machine.)
    • Felix_unsw
      Subscriber
      Hi, have you solved this issue? I am currently running into a similar problem and would much appreciate any advice.
  • The topic ‘HELP: Fluent freezes on HPC Cluster’ is closed to new replies.