Reply To: Fluent fails with Intel MPI protocol on 2 nodes

dv.makarov
Subscriber

Morning George, thank you for replying.

  1. We tested both Fluent 2023R1 and 2024R1.
  2. Following a similar discussion on this forum, we tried setting the environment variable "I_MPI_PLATFORM" to "none". Starting Fluent from the same terminal (in either text or graphical mode) still produced "I_MPI_PLATFORM=auto" in the Fluent output.
  3. Both Fluent versions crashed with the same output:

--------------------------------------------------------------------------------------------------------------------

Opening input/output transcript to file "/users/ssivaraman/fluent-20241022-100917-1713090.trn".

Auto-Transcript Start Time:  10:09:17, 22 Oct 2024

/opt/apps/ansys/v241/fluent/fluent24.1.0/bin/fluent -r24.1.0 3ddp -host -t256 -cnf=node[187-188] -path/opt/apps/ansys/v241/fluent -ssh -cx node187.pri.kelvin2.alces.network:43609:33349

Starting /opt/apps/ansys/v241/fluent/fluent24.1.0/lnamd64/3ddp_host/fluent.24.1.0 host -cx node187.pri.kelvin2.alces.network:43609:33349 "(list (rpsetvar (QUOTE parallel/function) "fluent 3ddp -flux -node -r24.1.0 -t256 -pdefault -mpi=intel -cnf=node[187-188] -ssh") (rpsetvar (QUOTE parallel/rhost) "") (rpsetvar (QUOTE parallel/ruser) "") (rpsetvar (QUOTE parallel/nprocs_string) "256") (rpsetvar (QUOTE parallel/auto-spawn?) #t) (rpsetvar (QUOTE parallel/trace-level) 0) (rpsetvar (QUOTE parallel/remote-shell) 1) (rpsetvar (QUOTE parallel/path) "/opt/apps/ansys/v241/fluent") (rpsetvar (QUOTE parallel/hostsfile) "node[187-188]") (rpsetvar (QUOTE gpuapp/devices) ""))"

              Welcome to ANSYS Fluent 2024 R1

 

              Copyright 1987-2024 ANSYS, Inc. All Rights Reserved.

              Unauthorized use, distribution or duplication is prohibited.

              This product is subject to U.S. laws governing export and re-export.

              For full Legal Notice, see documentation.

 

Build Time: Nov 22 2023 10:07:25 EST  Build Id: 10184 

 

Connected License Server List:  1055@193.61.145.219

 

     --------------------------------------------------------------

     This is an academic version of ANSYS FLUENT. Usage of this product

     license is limited to the terms and conditions specified in your ANSYS

     license form, additional terms section.

     --------------------------------------------------------------

Host spawning Node 0 on machine "node187.pri.kelvin2.alces.network" (unix).

/opt/apps/ansys/v241/fluent/fluent24.1.0/bin/fluent -r24.1.0 3ddp -flux -node -t256 -pdefault -mpi=intel -cnf=node[187-188] -ssh -mport 10.10.15.27:10.10.15.27:41037:0

Starting /opt/apps/ansys/v241/fluent/fluent24.1.0/multiport/mpi/lnamd64/intel2021/bin/mpirun -f /tmp/fluent-appfile.ssivaraman.1713921 --rsh=ssh -genv FI_PROVIDER tcp -genv FLUENT_ARCH lnamd64 -genv I_MPI_DEBUG 0 -genv I_MPI_ADJUST_GATHERV 3 -genv I_MPI_ADJUST_ALLREDUCE 2 -genv I_MPI_PLATFORM auto -genv PYTHONHOME /opt/apps/ansys/v241/fluent/fluent24.1.0/../../commonfiles/CPython/3_10/linx64/Release/python -genv FLUENT_PROD_DIR /opt/apps/ansys/v241/fluent/fluent24.1.0 -genv FLUENT_AFFINITY 0 -genv I_MPI_PIN enable -genv KMP_AFFINITY disabled -machinefile /tmp/fluent-appfile.ssivaraman.1713921 -np 256 /opt/apps/ansys/v241/fluent/fluent24.1.0/lnamd64/3ddp_node/fluent_mpi.24.1.0 node -mpiw intel -pic default -mport 10.10.15.27:10.10.15.27:41037:0

[mpiexec@node187.pri.kelvin2.alces.network] check_exit_codes (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:117): unable to run bstrap_proxy on node[187-188] (pid 1714765, exit code 65280)

[mpiexec@node187.pri.kelvin2.alces.network] poll_for_event (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:159): check exit codes error

[mpiexec@node187.pri.kelvin2.alces.network] HYD_dmx_poll_wait_for_proxy_event (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:212): poll for event error

[mpiexec@node187.pri.kelvin2.alces.network] HYD_bstrap_setup (../../../../../src/pm/i_hydra/libhydra/bstrap/src/intel/i_hydra_bstrap.c:1061): error waiting for event

[mpiexec@node187.pri.kelvin2.alces.network] HYD_print_bstrap_setup_error_message (../../../../../src/pm/i_hydra/mpiexec/intel/i_mpiexec.c:1027): error setting up the bootstrap proxies

[mpiexec@node187.pri.kelvin2.alces.network] Possible reasons:

[mpiexec@node187.pri.kelvin2.alces.network] 1. Host is unavailable. Please check that all hosts are available.

[mpiexec@node187.pri.kelvin2.alces.network] 2. Cannot launch hydra_bstrap_proxy or it crashed on one of the hosts. Make sure hydra_bstrap_proxy is available on all hosts and it has right permissions.

[mpiexec@node187.pri.kelvin2.alces.network] 3. Firewall refused connection. Check that enough ports are allowed in the firewall and specify them with the I_MPI_PORT_RANGE variable.

[mpiexec@node187.pri.kelvin2.alces.network] 4. Ssh bootstrap cannot launch processes on remote host. Make sure that passwordless ssh connection is established across compute hosts.

[mpiexec@node187.pri.kelvin2.alces.network]    You may try using -bootstrap option to select alternative launcher.

--------------------------------------------------------------------------------------------------------------------
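
A side note on the transcript above: mpiexec reports that it could not run bstrap_proxy on `node[187-188]`, which suggests the bracketed range from `-cnf=node[187-188]` may have reached Intel MPI as one literal host name rather than as two hosts. This is only a guess from the log, not a confirmed cause. If it is the case, one thing to try is passing explicit host names instead. The Python sketch below expands such a range. The function name `expand_hostlist` is hypothetical, and it handles only the simple `prefix[a-b,c]` form. On a Slurm cluster, `scontrol show hostnames` should do the same job.

```python
import re


def expand_hostlist(spec: str) -> list[str]:
    """Expand a compact range such as 'node[187-188]' into host names.

    Hypothetical helper: supports only one bracket group of the form
    prefix[a-b,c,...]. Anything else is returned unchanged.
    """
    match = re.fullmatch(r"([^\[\]]+)\[([^\[\]]+)\]", spec)
    if match is None:
        return [spec]  # already a plain host name
    prefix, body = match.groups()
    hosts = []
    for part in body.split(","):
        if "-" in part:
            start, end = part.split("-", 1)
            width = len(start)  # keep zero padding, e.g. 001-003
            for n in range(int(start), int(end) + 1):
                hosts.append(f"{prefix}{n:0{width}d}")
        else:
            hosts.append(prefix + part)
    return hosts


# Print one host per line, suitable for redirecting into a hosts file.
print("\n".join(expand_hostlist("node[187-188]")))
```

The resulting names could then be written one per line to a hosts file and passed to Fluent via `-cnf=`, or given as a comma-separated list, assuming the installed version accepts both forms.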