Licensing

    • FAQ

Answer: Either the bind order of the network interfaces or an incorrectly set MPI netmask (CCP_MPI_NETMASK) is causing the issue. A typical error may look like this:

    unable to connect to 10.0.0.12 node12 on port 52935, no endpoint matches the netmask 10.0.1.0/255.255.255.0

Note the difference in subnets: the target node is on the 10.0.0.x subnet, but the configured netmask only accepts endpoints on 10.0.1.x. Please have your cluster / network administrator review the suggestions below.

a) Check how many network interfaces the compute nodes have. If there are multiple interfaces, make sure that the bind order is set correctly.

b) If there is only one interface and you are still seeing this error, the MPI netmask may need to be corrected. For this example it needs to be set to the 10.0.0.* subnet, so the command will look like:

    cluscfg setenvs CCP_MPI_NETMASK=10.0.0.0/255.255.255.0

(A short sketch of the subnet test behind this error follows at the end of this answer.)

Additional information: If using RSM to submit the job to the cluster, the RSM job log may show errors like the example below:

    Running Solver : C:\Program Files\ANSYS Inc\v192\ansys\bin\winx64\ANSYS192.exe -b nolist -s noread -p ansys -i remote.dat -o solve.out -dis -mpi msmpi -np 12 -dir "C:/scratch/n3r39eoc.i2n"
    job aborted:
    [ranks] message
    [0] fatal error
    Fatal error in MPI_Comm_create: Other MPI error, error stack:
    MPI_Comm_create(MPI_COMM_WORLD, group=0x88000001, new_comm=0x000000E071458E90) failed
    [ch3:sock] rank 0 unable to connect to rank 8 using business card
    unable to connect to 10.0.0.12 node12 on port 52935, no endpoint matches the netmask 10.0.1.0/255.255.255.0
    [1-11] terminated
    ---- error analysis -----
    [0] on node01
    mpi has detected a fatal error and aborted
    C:\Program Files\ANSYS Inc\v192\ANSYS\bin\winx64\ANSYS.EXE
    ---- error analysis -----
    . . .
    Command Exit Code: -4
    ClusterJobs Exiting with code: -4
    Individual Command Exit Codes are: [-4]
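For reference, the "no endpoint matches the netmask" check is effectively a subnet-membership test: the peer's address is compared against the network defined by CCP_MPI_NETMASK. The sketch below reproduces that test with Python's standard ipaddress module, using the addresses from the example error above; it is an illustration of the check, not MS-MPI's actual implementation.

    import ipaddress

    def endpoint_matches(address: str, netmask: str) -> bool:
        """Return True if 'address' lies inside the network given by 'netmask'."""
        # strict=False accepts the "network/dotted-mask" form,
        # e.g. "10.0.1.0/255.255.255.0", exactly as CCP_MPI_NETMASK is written.
        return ipaddress.ip_address(address) in ipaddress.ip_network(netmask, strict=False)

    # The failing configuration from the error message:
    print(endpoint_matches("10.0.0.12", "10.0.1.0/255.255.255.0"))  # False -> connection rejected
    # After correcting CCP_MPI_NETMASK to the compute nodes' subnet:
    print(endpoint_matches("10.0.0.12", "10.0.0.0/255.255.255.0"))  # True  -> endpoint accepted

After changing the setting with cluscfg setenvs, the current cluster-wide value can be verified with cluscfg listenvs, and the job should be resubmitted so that the new value takes effect.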