Photonics

Topics related to Lumerical and more.

Error when running a topology optimization on a cluster (Linux)

    • samupekkaojanen
      Subscriber

      We have installed Lumerical on a cluster, and I have used it to perform some more computationally expensive simulations. I use the following command to run scripts, and it works perfectly:

      time xvfb-run fdtd-solutions -nw -run somescript.lsf

      Additionally, I am able to run Python scripts using the API with the following command:

      time xvfb-run "/home/opt/Lumerical/2021-r1/python-3.6.8rc1/bin/python" somepythonscript.py

      where somepythonscript.py sets up an FDTD simulation, runs it, and saves the result. This also works fine!

      However, recently I have tried to run some topology optimizations. I can run them perfectly on my laptop, but when I run the same file in the cluster, I get the following error:

      srun: job 26320520 has been allocated resources
      CONFIGURATION FILE {'root': '/home/opt/Lumerical/2021-r1/api/python', 'lumapi': '/home/opt/Lumerical/2021-r1/api/python'}
      Initializing super optimization
      Wavelength range of source object will be superseded by the global settings.
      Wavelength range of source object will be superseded by the global settings.
      Wavelength range of source object will be superseded by the global settings.
      Wavelength range of source object will be superseded by the global settings.
      Wavelength range of source object will be superseded by the global settings.
      Making adjoint solves
      Traceback (most recent call last):
        File "topology_runsim.py", line 82, in <module>
          runSim(params, eps_bg, eps_wg, x_pos, y_pos, size_x*1e-9, size_y*1e-9, filter_R, working_dir=working_dir, beta=1)
        File "topology_runsim.py", line 61, in runSim
          opt.run(working_dir = working_dir)
        File "/home/opt/Lumerical/2021-r1/api/python/lumopt/optimization.py", line 351, in run
          self.initialize(working_dir=working_dir)
        File "/home/opt/Lumerical/2021-r1/api/python/lumopt/optimization.py", line 247, in initialize
          plotting_function=plotting_function)
        File "/home/opt/Lumerical/2021-r1/api/python/lumopt/optimizers/optimizer.py", line 82, in initialize
          self.reset_start_params(start_params, self.scale_initial_gradient_to)
        File "/home/opt/Lumerical/2021-r1/api/python/lumopt/optimizers/optimizer.py", line 90, in reset_start_params
          self.auto_detect_scaling(scale_initial_gradient_to)
        File "/home/opt/Lumerical/2021-r1/api/python/lumopt/optimizers/optimizer.py", line 97, in auto_detect_scaling
          gradients = self.callable_jac(params)
        File "/home/opt/Lumerical/2021-r1/api/python/lumopt/optimizers/minimizer.py", line 34, in callable_jac_local
          fom_gradients = callable_jac(params_over_scaling_factor) / self.scaling_factor
        File "/home/opt/Lumerical/2021-r1/api/python/lumopt/optimization.py", line 170, in callable_jac
          nested_job_list = pool.map(func = make_adjoint_solves, iterable = enumerate(self.optimizations))
        File "/home/opt/Lumerical/2021-r1/python-3.6.8rc1/lib/python3.6/multiprocessing/pool.py", line 266, in map
          return self._map_async(func, iterable, mapstar, chunksize).get()
        File "/home/opt/Lumerical/2021-r1/python-3.6.8rc1/lib/python3.6/multiprocessing/pool.py", line 644, in get
          raise self._value
        File "/home/opt/Lumerical/2021-r1/python-3.6.8rc1/lib/python3.6/multiprocessing/pool.py", line 119, in worker
          result = (True, func(*args, **kwds))
        File "/home/opt/Lumerical/2021-r1/python-3.6.8rc1/lib/python3.6/multiprocessing/pool.py", line 44, in mapstar
          return list(map(*args))
        File "/home/opt/Lumerical/2021-r1/api/python/lumopt/optimization.py", line 163, in make_adjoint_solves
          forward_job_name = optimization.make_forward_sim(params, iter)
        File "/home/opt/Lumerical/2021-r1/api/python/lumopt/optimization.py", line 589, in make_forward_sim
          self.geometry.add_geo(self.sim, params = None, only_update = True)
        File "/home/opt/Lumerical/2021-r1/api/python/lumopt/geometries/topology.py", line 391, in add_geo
          fdtd.eval(script)
        File "/home/opt/Lumerical/2021-r1/api/python/lumapi.py", line 1184, in eval
          evalScript(self.handle, code)
        File "/home/opt/Lumerical/2021-r1/api/python/lumapi.py", line 248, in evalScript
          raise LumApiError("Failed to evaluate code")
      lumapi.LumApiError: 'Failed to evaluate code'

      I have not been able to figure out what the problem is. This is how I run the code:

      time xvfb-run "/home/opt/Lumerical/2021-r1/python-3.6.8rc1/bin/python" topology_runsim.py

      Any idea what could be causing the error?

    • Lito
      Ansys Employee

      Hi samupekkaojanen,

      Based on the information provided, you are running the scripts without GUI on Linux.

      >>>Running CAD jobs on headless Linux systems – Ansys Optics 

       

      If you are able to run: 

      xvfb-run "/home/opt/Lumerical/2021-r1/python-3.6.8rc1/bin/python" somepythonscript.py

      But you are having issues running a different lumapi/lumopt script:

      xvfb-run "/home/opt/Lumerical/2021-r1/python-3.6.8rc1/bin/python" topology_runsim.py

      Then the issue is with your script, "topology_runsim.py".

      Try saving the script below as lumtest.py:

      ## save as lumtest.py ##
      import lumapi
      fdtd = lumapi.FDTD()
      fdtd.addfdtd()
      fdtd.addring()
      fdtd.addmesh()
      fdtd.save("lumtestpy.fsp")

      Run the above script: 

      xvfb-run "/home/opt/Lumerical/2021-r1/python-3.6.8rc1/bin/python" /path_to_script/lumtest.py

      It should create the "lumtestpy.fsp" simulation file in the same working directory as the script.

    • samupekkaojanen
      Subscriber

       

      Hi Lito,

      Thanks for the reply.

      “you are running the scripts without GUI on Linux.”

      Yes, exactly! Like I mentioned, “topology_runsim.py” works when I run it on my laptop (Windows 10 with GUI)

      Also, I tried the script you sent with xvfb-run "/home/opt/Lumerical/2021-r1/python-3.6.8rc1/bin/python" lumtest.py, and it works fine. “lumtestpy.fsp” is created in the working directory as expected.

      -Samu

       

    • Greg Baethge
      Ansys Employee

       

      Hi Samu,

      It’s always tricky to troubleshoot these issues on headless nodes. The Python error indicates it is unable to run a script via the API:

       

      script=('select("import");'
            'delete;'
            'addimport;'
            'temp=zeros(length(x_geo),length(y_geo),2);'
            'temp(:,:,1)=eps_geo;'
            'temp(:,:,2)=eps_geo;'
            'importnk2(sqrt(temp),x_geo,y_geo,z_geo);')
      fdtd.eval(script)

       

      The error is on the last line. Typically, this is due either to an error in the script string or to a problem with the FDTD session opened via the API (for instance, if the FDTD UI crashes, fdtd.eval fails with the same error).

       Another test you can try is something like:

       

      import lumapi
      fdtd = lumapi.FDTD()
      script = ('addfdtd;'
               'addring;'
               'addmesh;')
      fdtd.eval(script)
      fdtd.save("lumtestpy.fsp")
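
To tell the two causes mentioned above apart (a bad statement in the script string versus a session that has died), one option is to evaluate the string one statement at a time and report the first one that fails. The following is only a sketch: the helper name first_failing_statement and the FakeSession stand-in are hypothetical, and the naive split on ";" assumes a flat script like the one shown above, with no semicolons inside string literals or loop bodies.

```python
def first_failing_statement(session, script):
    """Evaluate a flat Lumerical script one statement at a time.

    Returns (index, statement, error) for the first statement that raises,
    or None if every statement evaluates.
    """
    # Naive split: assumes no ';' inside string literals or loop bodies.
    statements = [s.strip() for s in script.split(";") if s.strip()]
    for index, statement in enumerate(statements):
        try:
            session.eval(statement + ";")
        except Exception as error:  # with lumapi this would be LumApiError
            return index, statement, error
    return None


class FakeSession:
    """Hypothetical stand-in for lumapi.FDTD so the sketch runs anywhere."""

    def __init__(self, bad_command):
        self.bad_command = bad_command

    def eval(self, code):
        if code.startswith(self.bad_command):
            raise RuntimeError("Failed to evaluate code")


script = ('select("import");'
          'delete;'
          'addimport;'
          'importnk2(sqrt(temp),x_geo,y_geo,z_geo);')
result = first_failing_statement(FakeSession("importnk2"), script)
print(result[0], result[1])
```

With a real session you would pass the object returned by lumapi.FDTD() instead of FakeSession. If even the first statement fails, the session itself is the more likely culprit.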

       

      Additionally, you’re using a fairly old version of Lumerical. Can you try with a more recent one?

       

    • samupekkaojanen
      Subscriber

      Hi Greg,

      Tried the test you suggested, and it succeeded, so there seems to be no problem with the "fdtd.eval(script)" command itself.

      One thing that comes to my mind is that inverse design plots the result during optimization. I wonder if that can cause an error, since there is no GUI? Is it possible to disable the plotting of results during inverse design?

      Are you aware if anyone else has tried to run inverse design on a server with no GUI, and if they have run into any issues?

      I have also requested that we install a new version of Lumerical on the server, but I am not sure they are willing to do that anytime soon.

    • Greg Baethge
      Ansys Employee

      Hi Samu,

      Thanks for the update. I think you can disable the plot by setting plot_history to False in the Optimization object (see Getting Started with lumopt). That said, I don't think it should be an issue, as you're running the optimization using a virtual display (Xvfb).
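
A minimal sketch of that suggestion, under assumptions: the helper name headless_plot_settings is hypothetical, and it assumes that lumopt's plotting goes through matplotlib, so the standard MPLBACKEND environment variable is also set to the non-interactive Agg backend. Only the plot_history keyword comes from the reply above.

```python
import os


def headless_plot_settings(env=os.environ):
    """Return keyword arguments that turn off lumopt's live plotting.

    Also requests matplotlib's non-interactive Agg backend through the
    MPLBACKEND environment variable, unless one is already set.
    """
    env.setdefault("MPLBACKEND", "Agg")
    return {"plot_history": False}


settings = headless_plot_settings()
print(settings)
# Intended use (not executed here, it needs lumopt and a full setup):
#   opt = Optimization(<your existing arguments>, **settings)
```

Call the helper before anything imports matplotlib, since the backend variable is read at import time.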

      Are you running an optimization you got from one of our examples or is it something you set up/modified?

      Depending on how much space you have on your home directory, you could try and install Lumerical locally, to run some tests. The process is explained here. Note this won't take care of any dependency, so it will only work if all the required libraries are already installed.

    • samupekkaojanen
      Subscriber

      We installed Lumerical v241 on the cluster. I am actually now running into the following error when running lumtest.py on the cluster with the new version:

      Traceback (most recent call last):
        File "/home/ojanen4/lumtest.py", line 12, in <module>
          fdtd = lumapi.FDTD()
        File "/home/opt/Lumerical/v241/api/python/lumapi.py", line 1541, in __init__
          super(FDTD, self).__init__('fdtd', filename, key, hide, serverArgs, remoteArgs, **kwargs)
        File "/home/opt/Lumerical/v241/api/python/lumapi.py", line 1196, in __init__
          handle = self.__open__(iapi, product, key, hide, serverArgs, remoteArgs)
        File "/home/opt/Lumerical/v241/api/python/lumapi.py", line 1415, in __open__
          raise LumApiError(error)
      lumapi.LumApiError: 'Exception [::appOpened]: Session not found'

      This occurs at line "fdtd = lumapi.FDTD()". This worked fine in the older version. What could be causing this?

    • Lito
      Ansys Employee

      @samupekkaojanen,

      The error below indicates that the API was not able to launch the Lumerical FDTD CAD/GUI and therefore could not find the CAD/GUI session.

      lumapi.LumApiError: 'Exception [::appOpened]: Session not found'

      Are you running CAD jobs (scripts/API) on a headless cluster? If you are not using a virtual display (Xvfb), you can set the environment variable below before running the API/script (2023R2 and newer releases):

      export QT_QPA_PLATFORM=offscreen

      Then set lumapi to "hide" the CAD/GUI session:

      lumapi.FDTD(hide=True)

      See our KB for details: > Running CAD jobs on headless Linux systems – Ansys Optics
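
The same two steps can also be done from inside the Python script. This is a sketch only: the function names prepare_headless_env and open_fdtd_hidden are hypothetical, and the DISPLAY check is an assumption about how to detect that no virtual display (such as one from xvfb-run) is in use.

```python
import os


def prepare_headless_env(env=os.environ):
    """Request Qt's offscreen platform when no X display is available.

    Returns False when DISPLAY is set (for example by xvfb-run) and the
    environment is left alone. Otherwise sets QT_QPA_PLATFORM to
    "offscreen" unless it is already defined, and returns True.
    """
    if env.get("DISPLAY"):
        return False
    env.setdefault("QT_QPA_PLATFORM", "offscreen")
    return True


def open_fdtd_hidden():
    """Open an FDTD session with the CAD window hidden."""
    prepare_headless_env()  # must happen before the session is launched
    import lumapi  # requires the Lumerical Python API on sys.path
    return lumapi.FDTD(hide=True)
```

The environment is prepared before lumapi.FDTD is called because the launched CAD process inherits its environment at start-up.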

      Hope this helps. 

    • samupekkaojanen
      Subscriber

      Hi Lito,

      I am using xvfb, but I also tried your suggestion. I still get the same error.

    • samupekkaojanen
      Subscriber

      I now tried to run a basic lsf script (simple.lsf) in FDTD with this command:

      time xvfb-run fdtd-solutions -nw -run simple.lsf

      But even this now fails. I get the following error:

      /home/opt/Lumerical/v241/bin/fdtd-solutions-app: error while loading shared libraries: libglut.so.3: cannot open shared object file: No such file or directory

      This worked in the older version on the cluster, but with this newly installed version (v241), these errors appear.
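
The loader error above names only the first library it fails to find, so it can save a round trip with the cluster admins to list every unresolved library at once. The sketch below parses the output of the standard ldd tool. The helper names are hypothetical and the sample text is made up for illustration; it is not output from this cluster.

```python
import subprocess


def missing_libraries(ldd_output):
    """Return the library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        name, arrow, target = line.strip().partition(" => ")
        if arrow and target.strip() == "not found":
            missing.append(name)
    return missing


def check_binary(path):
    """Run ldd on a binary and list its unresolved shared libraries."""
    result = subprocess.run(["ldd", path], capture_output=True, text=True)
    return missing_libraries(result.stdout)


# Made-up sample in the usual ldd layout, for illustration only.
sample = (
    "\tlinux-vdso.so.1 (0x00007ffd)\n"
    "\tlibglut.so.3 => not found\n"
    "\tlibc.so.6 => /lib64/libc.so.6 (0x00007f00)\n"
)
print(missing_libraries(sample))  # prints ['libglut.so.3']
```

On the cluster, check_binary("/home/opt/Lumerical/v241/bin/fdtd-solutions-app") would run the same check against the binary named in the error message.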

    • Lito
      Ansys Employee

      @samupekkaojanen,

      Your cluster is missing the package that provides the "libglut3" libraries (the libglut.so.3 file named in the error). Please ask your IT/cluster admin to install the missing packages/libraries required by Lumerical on your cluster.
      >>Required libraries for Ansys Lumerical on Linux – Ansys Optics

    • samupekkaojanen
      Subscriber

       

      Hi Lito,

      Thank you for the suggestion! The library problem is now fixed, but I’m still running into issues when trying to run a test .lsf script. I get the following two errors (I’ve hidden the license server address).

      First, this message:

      Messages file /ansys_inc/shared_files/licensing/language/en-us/ansysli_msgs.xml does not exist.

       

      Second, this license error:

      Feature:       lumerical_gui

      License path: 1055@****.**

      FlexNet Licensing error:-5,147

      Error: Failed to checkout feature 'lumerical_gui'
      No such feature exists.
      Feature:       lumerical_gui
      License path: 1055@****.**
       Licensing error:-5,147

      Would you like to reconfigure your license settings?, Response: No

      The license server and port should be correct, so I am not sure what the problem is here.

      Would you have any suggestions on how to fix this?

       

    • Lito
      Ansys Employee

      @samupekkaojanen,

      You can ignore the first message/notice about the XML file. This will not prevent you from checking out the license. 

      The second error indicates that there is no "lumerical_gui" (enterprise license) on the server. Set your Optics Launcher > license configuration (GUI) to obtain the license from the server on the port used by the Ansys license manager (consult IT/license admins for this information).
      >> Lumerical license configuration with the Ansys Optics Launcher – Ansys Optics 

      Otherwise, if you have the correct license server information, configure the "License.ini" file to check out the "Standard" licenses from your server.
      >>Lumerical license configuration from the command line – Ansys Optics

    • samupekkaojanen
      Subscriber

      Hi Lito,

      Thank you. The license error is now resolved.

      However, I am now trying to run the topology optimization again, and am running into the exact same error as in the previous version of Lumerical. I am running the following commands:

      module load lumerical/v241
      cd $SLURM_SUBMIT_DIR
      export QT_QPA_PLATFORM=offscreen  
      time "$LUMERICAL_ROOT/python/bin/python3" demux_runsim.py

      I am using the example .py file from here: https://optics.ansys.com/hc/en-us/articles/1500007188582-Topology-Optimization-of-a-4-channel-wavelength-demultiplexer-2D-TE

      And I get the following error: 

      CONFIGURATION FILE {'root': '/home/opt/Lumerical/v241/api/python', 'lumapi': '/home/opt/Lumerical/v241/api/python'}
      Initializing super optimization
      Traceback (most recent call last):
        File "/home/ojanen4/demux_runsim.py", line 82, in <module>
          runSim(params, eps_bg, eps_wg, x_pos, y_pos, size_x*1e-9, size_y*1e-9, filter_R, working_dir=working_dir, beta=1)
        File "/home/ojanen4/demux_runsim.py", line 61, in runSim
          opt.run(working_dir = working_dir)
        File "/home/opt/Lumerical/v241/api/python/lumopt/optimization.py", line 460, in run
          self.initialize(working_dir=working_dir)
        File "/home/opt/Lumerical/v241/api/python/lumopt/optimization.py", line 119, in initialize
          self.one_forward = check_one_forward_sim(self.optimizations[0])
        File "/home/opt/Lumerical/v241/api/python/lumopt/optimization.py", line 82, in check_one_forward_sim
          co_opt.sim = Simulation(self.workingDir, co_opt.use_var_fdtd, co_opt.hide_fdtd_cad)
        File "/home/opt/Lumerical/v241/api/python/lumopt/utilities/simulation.py", line 22, in __init__
          self.fdtd = lumapi.MODE(hide = hide_fdtd_cad) if use_var_fdtd else lumapi.FDTD(hide = hide_fdtd_cad)
        File "/home/opt/Lumerical/v241/api/python/lumapi.py", line 1541, in __init__
          super(FDTD, self).__init__('fdtd', filename, key, hide, serverArgs, remoteArgs, **kwargs)
        File "/home/opt/Lumerical/v241/api/python/lumapi.py", line 1196, in __init__
          handle = self.__open__(iapi, product, key, hide, serverArgs, remoteArgs)
        File "/home/opt/Lumerical/v241/api/python/lumapi.py", line 1415, in __open__
          raise LumApiError(error)
      lumapi.LumApiError: 'Exception [::appOpened]: Session not found'
      Exception ignored in:
      Traceback (most recent call last):
        File "/home/opt/Lumerical/v241/api/python/lumopt/utilities/simulation.py", line 67, in __del__
          self.fdtd.close()
      AttributeError: 'Simulation' object has no attribute 'fdtd'

    • samupekkaojanen
      Subscriber

      I also tried with xvfb (time xvfb-run "$LUMERICAL_ROOT/python/bin/python3" demux_runsim.py), but it does not work either. The code gets slightly farther, but then crashes with the following error:

      CONFIGURATION FILE {'root': '/home/opt/Lumerical/v241/api/python', 'lumapi': '/home/opt/Lumerical/v241/api/python'}
      Initializing super optimization
      Checking for one forward simulation :   More than One Wavelengths range, one forward simulation is not possible
      Wavelength range of source object will be superseded by the global settings.
      Traceback (most recent call last):
        File "/home/ojanen4/demux_runsim.py", line 82, in <module>
          runSim(params, eps_bg, eps_wg, x_pos, y_pos, size_x*1e-9, size_y*1e-9, filter_R, working_dir=working_dir, beta=1)
        File "/home/ojanen4/demux_runsim.py", line 61, in runSim
          opt.run(working_dir = working_dir)
        File "/home/opt/Lumerical/v241/api/python/lumopt/optimization.py", line 460, in run
          self.initialize(working_dir=working_dir)
        File "/home/opt/Lumerical/v241/api/python/lumopt/optimization.py", line 141, in initialize
          list(map(init_suboptimization, self.optimizations))
        File "/home/opt/Lumerical/v241/api/python/lumopt/optimization.py", line 137, in init_suboptimization
          cur_optimization.initialize(local_working_dir)
        File "/home/opt/Lumerical/v241/api/python/lumopt/optimization.py", line 669, in initialize
          self.geometry.add_geo(self.sim, start_params, only_update = False)
        File "/home/opt/Lumerical/v241/api/python/lumopt/geometries/topology.py", line 410, in add_geo
          fdtd.eval(script)
        File "/home/opt/Lumerical/v241/api/python/lumapi.py", line 1430, in eval
          evalScript(self.handle, code, True)
        File "/home/opt/Lumerical/v241/api/python/lumapi.py", line 300, in evalScript
          _evalScriptInternal(s, code)
        File "/home/opt/Lumerical/v241/api/python/lumapi.py", line 288, in _evalScriptInternal
          raise LumApiError("Failed to evaluate code")
      lumapi.LumApiError: 'Failed to evaluate code'
      QProcess: Destroyed while process ("/home/opt/Lumerical/v241/bin/fdtd-solutions") is still running.

       

    • samupekkaojanen
      Subscriber

       

      I also updated Lumerical on my Windows laptop, and it now crashes there as well, but at a different point in the code:

      CONFIGURATION FILE {'root': 'C:\\Program Files\\Lumerical\\v241\\api\\python', 'lumapi': 'C:\\Program Files\\Lumerical\\v241\\api\\python'}
      Initializing super optimization
      Checking for one forward simulation :   More than One Wavelengths range, one forward simulation is not possible
      Wavelength range of source object will be superseded by the global settings.
      Wavelength range of source object will be superseded by the global settings.
      Wavelength range of source object will be superseded by the global settings.
      Wavelength range of source object will be superseded by the global settings.
      Wavelength range of source object will be superseded by the global settings.
      Making adjoint solves
      Running solves
      Processing adjoint solves
      FOM = 0.975600292420355 (1 - 0.024399707579645047)
      FOM = 0.9134124921948557 (1 - 0.08658750780514435)
      FOM = 0.9635248911794267 (1 - 0.036475108820573254)
      FOM = 0.0029434785871421676
      FOM = 0.9126453599424964 (1 - 0.0873546400575036)
      Traceback (most recent call last):
        File "C:\Users\ojanen4\Documents\cluter testing\demux_runsim.py", line 82, in <module>
          runSim(params, eps_bg, eps_wg, x_pos, y_pos, size_x*1e-9, size_y*1e-9, filter_R, working_dir=working_dir, beta=1)
        File "C:\Users\ojanen4\Documents\cluter testing\demux_runsim.py", line 61, in runSim
          opt.run(working_dir = working_dir)
        File "C:\Program Files\Lumerical\v241\api\python\lumopt\optimization.py", line 460, in run
          self.initialize(working_dir=working_dir)
        File "C:\Program Files\Lumerical\v241\api\python\lumopt\optimization.py", line 350, in initialize
          self.optimizer.initialize(start_params=start_params,
        File "C:\Program Files\Lumerical\v241\api\python\lumopt\optimizers\optimizer.py", line 100, in initialize
          self.reset_start_params(start_params, self.scale_initial_gradient_to)
        File "C:\Program Files\Lumerical\v241\api\python\lumopt\optimizers\optimizer.py", line 108, in reset_start_params
          self.auto_detect_scaling(scale_initial_gradient_to)
        File "C:\Program Files\Lumerical\v241\api\python\lumopt\optimizers\optimizer.py", line 115, in auto_detect_scaling
          gradients = self.callable_jac(params)
        File "C:\Program Files\Lumerical\v241\api\python\lumopt\optimizers\minimizer.py", line 34, in callable_jac_local
          fom_gradients = callable_jac(params_over_scaling_factor) / self.scaling_factor
        File "C:\Program Files\Lumerical\v241\api\python\lumopt\optimization.py", line 297, in callable_jac
          jac_list = pool.map(process_adjoint_solves, enumerate(self.optimizations))
        File "C:\Program Files\Lumerical\v241\python\lib\multiprocessing\pool.py", line 364, in map
          return self._map_async(func, iterable, mapstar, chunksize).get()
        File "C:\Program Files\Lumerical\v241\python\lib\multiprocessing\pool.py", line 771, in get
          raise self._value
        File "C:\Program Files\Lumerical\v241\python\lib\multiprocessing\pool.py", line 125, in worker
          result = (True, func(*args, **kwds))
        File "C:\Program Files\Lumerical\v241\python\lib\multiprocessing\pool.py", line 48, in mapstar
          return list(map(*args))
        File "C:\Program Files\Lumerical\v241\api\python\lumopt\optimization.py", line 290, in process_adjoint_solves
          jac = optimization.calculate_gradients()
        File "C:\Program Files\Lumerical\v241\api\python\lumopt\optimization.py", line 910, in calculate_gradients
          grad_name = self.geometry.calculate_gradients_on_cad(self.sim, 'forward_fields', 'adjoint_fields', self.scaling_factor)
        File "C:\Program Files\Lumerical\v241\api\python\lumopt\geometries\topology.py", line 290, in calculate_gradients_on_cad
          sim.fdtd.eval(('params = struct;'
        File "C:\Program Files\Lumerical\v241\api\python\lumapi.py", line 1430, in eval
          evalScript(self.handle, code, True)
        File "C:\Program Files\Lumerical\v241\api\python\lumapi.py", line 300, in evalScript
          _evalScriptInternal(s, code)
        File "C:\Program Files\Lumerical\v241\api\python\lumapi.py", line 288, in _evalScriptInternal
          raise LumApiError("Failed to evaluate code")
      lumapi.LumApiError: 'Failed to evaluate code'

      Here the crash seems to occur at this call in topology.py:

      sim.fdtd.eval(('params = struct;'
                             'params.eps_levels=[{0},{1}];'
                             'params.filter_radius = {2};'
                             'params.beta = {3};'
                             'params.eta = {4};'
                             'params.dx = {5};'
                             'params.dy = {6};'
                             'params.dz = 0.0;'
                             'dF_dp = topoparamstogradient(params,topo_rho,dF_dEps);').format(self.eps_min,self.eps_max,self.filter_R,self.beta,self.eta,self.dx,self.dy) )
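
Because the API reduces any script error to "Failed to evaluate code", it can help to render the formatted string and inspect it, or paste it into the script prompt of a GUI session, where the underlying script error is shown. The sketch below only reproduces the string formatting from the call above. The function name is hypothetical and the numeric values are made up for illustration; they are not the values from this optimization. The script also refers to topo_rho and dF_dEps, which presumably have to exist in the session workspace before it can run.

```python
GRADIENT_TEMPLATE = ('params = struct;'
                     'params.eps_levels=[{0},{1}];'
                     'params.filter_radius = {2};'
                     'params.beta = {3};'
                     'params.eta = {4};'
                     'params.dx = {5};'
                     'params.dy = {6};'
                     'params.dz = 0.0;'
                     'dF_dp = topoparamstogradient(params,topo_rho,dF_dEps);')


def render_gradient_script(eps_min, eps_max, filter_R, beta, eta, dx, dy):
    """Fill the template exactly as the .format(...) call above does."""
    return GRADIENT_TEMPLATE.format(eps_min, eps_max, filter_R, beta, eta, dx, dy)


# Illustrative values only (hypothetical, not taken from the thread).
script = render_gradient_script(eps_min=2.0, eps_max=12.0, filter_R=5e-07,
                                beta=1, eta=0.5, dx=2e-08, dy=2e-08)
for statement in script.split(";"):
    if statement:
        print(statement + ";")
```

Printing one statement per line makes it easier to spot a value that was formatted unexpectedly, such as nan or a Python-style array.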

       

    • samupekkaojanen
      Subscriber

      Quick update: shape optimization works fine both on my Windows laptop and on the headless Linux server. I found that launching the Python scripts through xvfb-run leads to a crash. When I run the shape optimization like this instead, it works perfectly:

      export QT_QPA_PLATFORM=offscreen  
      time "$LUMERICAL_ROOT/python/bin/python3" splitter_with_arms.py

      Topology optimization still doesn't work.

  • The topic ‘Error when running a topology optimization on a cluster (Linux)’ is closed to new replies.