Electronics

Topics related to HFSS, Maxwell, SIwave, Icepak, Electronics Enterprise and more.

Need help executing Ansys Maxwell batchsolve on an HPC cluster

    • kxp385
      Subscriber

      I am trying to run my Optimetrics design setup (~3000 combinations) on an HPC cluster that uses the SLURM scheduler. I reserved a node with 100 cores on the cluster, generated a list of machine names, and ran the following command.

      Command:

      ansysedt -ng -distributed -machinelist list=$NODELIST -batchsolve Maxwell3DDesign_xx:Optimetrics "/rds/projects/xx/src/Ansys/project1_xx.aedt"

      Issues when running jobs with above command:
      1. No matter how long I reserve the nodes for, there are always combinations in the Optimetrics setup that don't get finished. This problem persists unless there are more cores available than Optimetrics combinations to be solved; until then I have to keep deleting the lock file and rerunning the job.
      2. No acceleration even when multiple nodes are used. Only the cores on a single node are utilized by batchsolve (confirmed from the Optimetrics report).
       
      Questions:
      1. Is there a problem with the command I am running to solve the Optimetrics setup?
      2. Do I need to set up any HPC-related settings inside the Ansys project itself? Currently I am not setting anything.
      3. Do I need to load an MPI module on my HPC cluster to be able to run the above command on multiple nodes?
      4. Is there any other obvious way to accelerate the batchsolve that I am missing?
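
For context on the `-machinelist list=$NODELIST` form used above: in a SLURM job the allocated hostnames can be expanded with `scontrol show hostnames "$SLURM_JOB_NODELIST"` and joined into a comma-separated list. A minimal sketch of the joining step (the sample hostnames are placeholders; in a real job they would come from scontrol):

```shell
# Sketch: build a comma-separated machine list for "-machinelist list=...".
# In a real SLURM job the hostnames would come from:
#   hostnames=$(scontrol show hostnames "$SLURM_JOB_NODELIST")
# Sample hostnames stand in here so the joining step can be shown.
hostnames="node001
node002
node003"
NODELIST=$(printf '%s\n' "$hostnames" | paste -sd, -)
echo "$NODELIST"   # node001,node002,node003
```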
       
    • randyk
      Forum Moderator

      Hi kxp385,

      Please consider this example script:

      #!/bin/bash
      #SBATCH -N 2               # allocate 2 nodes
      #SBATCH -n 12              # 12 tasks total
      #SBATCH -J AnsysEMTest     # sensible name for the job
      #SBATCH -p default           # partition name
      #SBATCH --output=multiprocess_%j.log # Standard output and error log
      #SBATCH --exclusive        # no other jobs on the nodes while job is running
       
      # Project Name and setup
      Project=project1_xx.aedt
      AnalysisSetup="Maxwell3DDesign_xx:Optimetrics"
      JobFolder=$(pwd)
      #### Do not modify any items below this line unless requested ####
      InstFolder=/path/v252/AnsysEM
       
      # SLURM
      export ANSYSEM_GENERIC_MPI_WRAPPER=${InstFolder}/schedulers/scripts/utils/slurm_srun_wrapper.sh
      export ANSYSEM_COMMON_PREFIX=${InstFolder}/common
      srun_cmd="srun --overcommit --export=ALL  -n 1 -N 1 --cpu-bind=none --mem-per-cpu=0 --overlap "
      # Autocompute total cores from node allocation
      export ANSYSEM_TASKS_PER_NODE="${SLURM_TASKS_PER_NODE}"
       
      # skip OS/Dependency check
      export ANS_IGNOREOS=1
      export ANS_NODEPCHECK=1
       
      # MPI timeout defaults to 30 min (a cloud-friendly value); 120 or 240 seconds is suggested on-prem
      export MPI_TIMEOUT_SECONDS=120
       
      # Submit AEDT Job (SLURM)
      ${srun_cmd} ${InstFolder}/ansysedt -ng -monitor -waitforlicense -useelectronicsppe=1 -distributed -machinelist numcores=$SLURM_NTASKS -auto -batchoptions "'Maxwell 3D/MPIVendor'='Intel' 'Maxwell 3D/MPIVersion'='2021' 'Maxwell 3D/RemoteSpawnCommand'='scheduler'" -batchsolve ${AnalysisSetup} ${JobFolder}/${Project}

      Then make the script executable and submit it with sbatch (the script name below is a placeholder):
      $ chmod +x submit_aedt.sh
      $ sbatch submit_aedt.sh
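
A note on the `ANSYSEM_TASKS_PER_NODE="${SLURM_TASKS_PER_NODE}"` line in the script: SLURM encodes the task layout in a compressed form such as `6(x2)` (6 tasks on each of 2 nodes) or `8,4`. A hedged sketch of decoding that value into a total task count, for anyone verifying what their allocation expands to (the sample value is a stand-in; SLURM exports the real one inside a job):

```shell
# Decode SLURM_TASKS_PER_NODE (e.g. "6(x2)" or "8,4") into a total task count.
# Sample value used here; inside a real job SLURM sets this variable itself.
SLURM_TASKS_PER_NODE="6(x2)"
total=0
for part in ${SLURM_TASKS_PER_NODE//,/ }; do
  if [[ $part == *"(x"* ]]; then
    tasks=${part%%(*}                # tasks per node, e.g. 6
    reps=${part#*x}; reps=${reps%)}  # node repeat count, e.g. 2
    total=$(( total + tasks * reps ))
  else
    total=$(( total + part ))
  fi
done
echo "$total"   # 12
```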
