Error when running a topology optimization on a cluster (Linux)

- January 8, 2024 at 1:57 pm
  
  samupekkaojanen
  Subscriber
  
  We have installed Lumerical on a cluster, and I have used it to run some more computationally expensive simulations. I use the following command to run scripts, and it works perfectly:
  time xvfb-run fdtd-solutions -nw -run somescript.lsf
  I am also able to run Python scripts that use the API with the following:
  time xvfb-run "/home/opt/Lumerical/2021-r1/python-3.6.8rc1/bin/python" somepythonscript.py
  where somepythonscript.py sets up an FDTD simulation, runs it, and saves the result. This also works fine!
  However, I recently tried to run some topology optimizations. They run perfectly on my laptop, but when I run the same file on the cluster, I get the following error:
  srun: job 26320520 has been allocated resources
  CONFIGURATION FILE {'root': '/home/opt/Lumerical/2021-r1/api/python', 'lumapi': '/home/opt/Lumerical/2021-r1/api/python'}
  Initializing super optimization
  Wavelength range of source object will be superseded by the global settings.
  Wavelength range of source object will be superseded by the global settings.
  Wavelength range of source object will be superseded by the global settings.
  Wavelength range of source object will be superseded by the global settings.
  Wavelength range of source object will be superseded by the global settings.
  Making adjoint solves
  Traceback (most recent call last):
  File "topology_runsim.py", line 82, in <module>
  runSim(params, eps_bg, eps_wg, x_pos, y_pos, size_x*1e-9, size_y*1e-9, filter_R, working_dir=working_dir, beta=1)
  File "topology_runsim.py", line 61, in runSim
  opt.run(working_dir = working_dir)
  File "/home/opt/Lumerical/2021-r1/api/python/lumopt/optimization.py", line 351, in run
  self.initialize(working_dir=working_dir)
  File "/home/opt/Lumerical/2021-r1/api/python/lumopt/optimization.py", line 247, in initialize
  plotting_function=plotting_function)
  File "/home/opt/Lumerical/2021-r1/api/python/lumopt/optimizers/optimizer.py", line 82, in initialize
  self.reset_start_params(start_params, self.scale_initial_gradient_to)
  File "/home/opt/Lumerical/2021-r1/api/python/lumopt/optimizers/optimizer.py", line 90, in reset_start_params
  self.auto_detect_scaling(scale_initial_gradient_to)
  File "/home/opt/Lumerical/2021-r1/api/python/lumopt/optimizers/optimizer.py", line 97, in auto_detect_scaling
  gradients = self.callable_jac(params)
  File "/home/opt/Lumerical/2021-r1/api/python/lumopt/optimizers/minimizer.py", line 34, in callable_jac_local
  fom_gradients = callable_jac(params_over_scaling_factor) / self.scaling_factor
  File "/home/opt/Lumerical/2021-r1/api/python/lumopt/optimization.py", line 170, in callable_jac
  nested_job_list = pool.map(func = make_adjoint_solves, iterable = enumerate(self.optimizations))
  File "/home/opt/Lumerical/2021-r1/python-3.6.8rc1/lib/python3.6/multiprocessing/pool.py", line 266, in map
  return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/home/opt/Lumerical/2021-r1/python-3.6.8rc1/lib/python3.6/multiprocessing/pool.py", line 644, in get
  raise self._value
  File "/home/opt/Lumerical/2021-r1/python-3.6.8rc1/lib/python3.6/multiprocessing/pool.py", line 119, in worker
  result = (True, func(*args, **kwds))
  File "/home/opt/Lumerical/2021-r1/python-3.6.8rc1/lib/python3.6/multiprocessing/pool.py", line 44, in mapstar
  return list(map(*args))
  File "/home/opt/Lumerical/2021-r1/api/python/lumopt/optimization.py", line 163, in make_adjoint_solves
  forward_job_name = optimization.make_forward_sim(params, iter)
  File "/home/opt/Lumerical/2021-r1/api/python/lumopt/optimization.py", line 589, in make_forward_sim
  self.geometry.add_geo(self.sim, params = None, only_update = True)
  File "/home/opt/Lumerical/2021-r1/api/python/lumopt/geometries/topology.py", line 391, in add_geo
  fdtd.eval(script)
  File "/home/opt/Lumerical/2021-r1/api/python/lumapi.py", line 1184, in eval
  evalScript(self.handle, code)
  File "/home/opt/Lumerical/2021-r1/api/python/lumapi.py", line 248, in evalScript
  raise LumApiError("Failed to evaluate code")
  lumapi.LumApiError: 'Failed to evaluate code'
  I have not been able to figure out what the problem is. This is how I run the code:
  time xvfb-run "/home/opt/Lumerical/2021-r1/python-3.6.8rc1/bin/python" topology_runsim.py
  Any idea what could be causing the error?
- January 11, 2024 at 1:58 am
  Lito
  Ansys Employee
  Hi samupekkaojanen,
  Based on the information provided, you are running the scripts without GUI on Linux.
  >>>Running CAD jobs on headless Linux systems – Ansys Optics
  
  If you are able to run:
```
xvfb-run "/home/opt/Lumerical/2021-r1/python-3.6.8rc1/bin/python" somepythonscript.py
```
  But you are having issues running a different lumapi/lumopt script:
```
xvfb-run "/home/opt/Lumerical/2021-r1/python-3.6.8rc1/bin/python" topology_runsim.py
```
  Then the issue is with your script, "topology_runsim.py".
  Try saving the script below (lumtest.py):
```
## save as lumtest.py ##
import lumapi
fdtd = lumapi.FDTD()
fdtd.addfdtd()
fdtd.addring()
fdtd.addmesh()
fdtd.save("lumtestpy.fsp")
```
  Run the above script:
```
xvfb-run "/home/opt/Lumerical/2021-r1/python-3.6.8rc1/bin/python" /path_to_script/lumtest.py
```
  It should create the "lumtestpy.fsp" simulation file in the same working directory as the script.
- January 11, 2024 at 7:58 am
  
  samupekkaojanen
  Subscriber
  
  Hi Lito,
  Thanks for the reply.
  “you are running the scripts without GUI on Linux.”
  Yes, exactly! Like I mentioned, “topology_runsim.py” works when I run it on my laptop (Windows 10 with GUI).
  Also, I tried the script you sent with xvfb-run "/home/opt/Lumerical/2021-r1/python-3.6.8rc1/bin/python" lumtest.py, and it works fine. “lumtestpy.fsp” is created in the working directory as expected.
  -Samu
- January 12, 2024 at 3:57 pm
  Greg Baethge
  Ansys Employee
  Hi Samu,
  It’s always tricky to troubleshoot these issues on headless nodes. The Python error indicates it is unable to run a script via the API:
```
script=('select("import");'
        'delete;'
        'addimport;'
        'temp=zeros(length(x_geo),length(y_geo),2);'
        'temp(:,:,1)=eps_geo;'
        'temp(:,:,2)=eps_geo;'
        'importnk2(sqrt(temp),x_geo,y_geo,z_geo);')
fdtd.eval(script)
```
  The error is on the last line. Typically, this is due either to an error in the script string or to a problem with the FDTD session opened via the API (for instance, if the FDTD UI crashes, fdtd.eval fails with the same error).
  Another test you can try is something like:
```
import lumapi
fdtd = lumapi.FDTD()
script = ('addfdtd;'
          'addring;'
          'addmesh;')
fdtd.eval(script)
fdtd.save("lumtestpy.fsp")
```
  Additionally, you’re using a fairly old version of Lumerical. Can you try with a more recent one?
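  To tell apart the two causes mentioned above (a bad statement in the script string versus a dead FDTD session), one option is to evaluate the script one statement at a time and report the first statement that raises. The sketch below is only an illustration: the helper names are hypothetical, and splitting on ";" is a naive assumption that breaks for scripts containing ";" inside string literals or multi-line blocks such as loops. Variables the script relies on (for example x_geo or eps_geo) must already exist in the session.

```python
def split_statements(script):
    """Split a Lumerical script string into ';'-terminated statements (naive split)."""
    return [part.strip() + ";" for part in script.split(";") if part.strip()]


def first_failing_statement(eval_fn, script):
    """Evaluate statements one by one with eval_fn.

    Returns (index, statement, exception) for the first statement that raises,
    or None if every statement evaluates without error.
    """
    for index, statement in enumerate(split_statements(script)):
        try:
            eval_fn(statement)
        except Exception as exc:  # lumapi raises LumApiError here
            return index, statement, exc
    return None


# Hypothetical usage with an open session `fdtd = lumapi.FDTD(hide=True)`:
#   failure = first_failing_statement(fdtd.eval, script)
#   print(failure)
```

  If the very first statement already fails, the session itself is the more likely culprit. If a later statement such as importnk2 fails, the problem is more likely in that command or its inputs.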
- January 15, 2024 at 1:09 pm
  
  samupekkaojanen
  Subscriber
  
  Hi Greg,
  Tried the test you suggested, and it succeeded, so there seems to be no problem with the "fdtd.eval(script)" command itself.
  One thing that comes to my mind is that inverse design plots the result during optimization. I wonder if that can cause an error, since there is no GUI? Is it possible to disable the plotting of results during inverse design?
  Are you aware if anyone else has tried to run inverse design on a server with no GUI, and if they have run into any issues?
  I have also requested that a newer version of Lumerical be installed on the server, but I am not sure they are willing to do that anytime soon.
- January 16, 2024 at 8:38 am
  
  Greg Baethge
  Ansys Employee
  
  Hi Samu,
  Thanks for the update. I think you can disable the plot by setting plot_history to False in the Optimization object (see Getting Started with lumopt). That said, I don't think it should be an issue, as you're running the optimization using a virtual display (Xvfb).
  Are you running an optimization you got from one of our examples or is it something you set up/modified?
  Depending on how much space you have on your home directory, you could try and install Lumerical locally, to run some tests. The process is explained here. Note this won't take care of any dependency, so it will only work if all the required libraries are already installed.
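  A minimal sketch of the plotting-related settings discussed above, assuming the optimization is built in Python. plot_history is the option named in this reply; hide_fdtd_cad is the attribute visible in the lumopt traceback, and passing it as a keyword argument is an assumption here. The MPLBACKEND environment variable is a standard matplotlib setting and is only an extra precaution, not something lumopt is known to require. The helper name is hypothetical.

```python
import os


def headless_plot_settings(environ=os.environ):
    """Pick a non-interactive matplotlib backend and return lumopt keyword arguments.

    Call this before matplotlib (and therefore lumopt) is imported, because
    matplotlib reads MPLBACKEND when it is first imported.
    """
    # "Agg" renders to memory/files and needs no display; keep any existing choice.
    environ.setdefault("MPLBACKEND", "Agg")
    return {"plot_history": False, "hide_fdtd_cad": True}


# Hypothetical usage, with the other arguments taken from your own script:
#   kwargs = headless_plot_settings()
#   from lumopt.optimization import Optimization
#   opt = Optimization(base_script=..., wavelengths=..., fom=...,
#                      geometry=..., optimizer=..., **kwargs)
```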
- January 26, 2024 at 1:56 pm
  
  samupekkaojanen
  Subscriber
  
  We installed Lumerical v241 on the cluster. I am actually now running into the following error when running lumtest.py on the cluster with the new version:
  
  Traceback (most recent call last):
  File "/home/ojanen4/lumtest.py", line 12, in <module>
  fdtd = lumapi.FDTD()
  File "/home/opt/Lumerical/v241/api/python/lumapi.py", line 1541, in __init__
  super(FDTD, self).__init__('fdtd', filename, key, hide, serverArgs, remoteArgs, **kwargs)
  File "/home/opt/Lumerical/v241/api/python/lumapi.py", line 1196, in __init__
  handle = self.__open__(iapi, product, key, hide, serverArgs, remoteArgs)
  File "/home/opt/Lumerical/v241/api/python/lumapi.py", line 1415, in __open__
  raise LumApiError(error)
  lumapi.LumApiError: 'Exception [::appOpened]: Session not found'
  
  This occurs at line "fdtd = lumapi.FDTD()". This worked fine in the older version. What could be causing this?
- January 26, 2024 at 5:29 pm
  Lito
  Ansys Employee
  @samupekkaojanen,
  The error below indicates that the API was not able to open/launch the Lumerical FDTD CAD/GUI, and could not find the CAD/GUI session.
```
lumapi.LumApiError: 'Exception [::appOpened]: Session not found'
```
  Are you running CAD jobs (scripts/API) on a headless cluster? If you are not using a virtual display (Xvfb), you can set the environment variable below on 2023 R2 and newer releases before running the API/script:
```
export QT_QPA_PLATFORM=offscreen
```
  Then set lumapi to "hide" the CAD/GUI session:
```
lumapi.FDTD(hide=True)
```
  See our KB for details: > Running CAD jobs on headless Linux systems – Ansys Optics
  Hope this helps.
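  The same two steps can also be done from inside the Python script, as long as the environment variable is set before the session is opened. This is a sketch under assumptions: the helper names are hypothetical, and using an empty DISPLAY variable as the test for "no virtual display" is a simplification.

```python
import os


def prepare_headless_qt(environ=os.environ):
    """Set QT_QPA_PLATFORM=offscreen when no X display (e.g. from Xvfb) is advertised.

    Returns the resulting value of QT_QPA_PLATFORM, or None if it is unset.
    """
    if not environ.get("DISPLAY"):
        environ["QT_QPA_PLATFORM"] = "offscreen"
    return environ.get("QT_QPA_PLATFORM")


def open_hidden_fdtd():
    """Open an FDTD session with the CAD hidden (needs Lumerical's api/python on sys.path)."""
    prepare_headless_qt()
    import lumapi  # imported late so the environment is configured first
    return lumapi.FDTD(hide=True)
```

  Under xvfb-run the DISPLAY variable is already set, so the helper leaves QT_QPA_PLATFORM untouched in that case.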
- February 26, 2024 at 12:51 pm
  
  samupekkaojanen
  Subscriber
  
  Hi Lito,
  I am using xvfb, but I also tried your suggestion. I still get the same error.
- February 26, 2024 at 1:00 pm
  
  samupekkaojanen
  Subscriber
  
  I now tried to run a basic lsf script (simple.lsf) in FDTD with this command:
  time xvfb-run fdtd-solutions -nw -run simple.lsf
  But even this now fails. I get the following error:
  /home/opt/Lumerical/v241/bin/fdtd-solutions-app: error while loading shared libraries: libglut.so.3: cannot open shared object file: No such file or directory
  This worked in the older version on the cluster, but with this newly installed version (v241), these errors appear.
- February 26, 2024 at 7:32 pm
  
  Lito
  Ansys Employee
  
  @samupekkaojanen,
  Your cluster is missing the package that provides the "libglut3" libraries. Please ask your IT/cluster admin to install the missing packages/libraries required by Lumerical on your cluster.
  >>Required libraries for Ansys Lumerical on Linux – Ansys Optics
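  Before asking the admins, it can help to list every unresolved shared library at once rather than discovering them one error at a time. The sketch below parses the output of the standard ldd tool; the function names are hypothetical, and the binary path in the usage comment is the one from the error message above.

```python
import subprocess


def missing_libraries(ldd_output):
    """Return the library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "=> not found" in line:
            # Lines look like: "\tlibglut.so.3 => not found"
            missing.append(line.split("=>")[0].strip())
    return missing


def check_binary(path):
    """Run ldd on a binary and return its unresolved shared libraries."""
    result = subprocess.run(
        ["ldd", path],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return missing_libraries(result.stdout)


# Usage on the cluster:
#   print(check_binary("/home/opt/Lumerical/v241/bin/fdtd-solutions-app"))
```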
- February 28, 2024 at 9:40 am
  
  samupekkaojanen
  Subscriber
  
  Hi Lito,
  Thank you for the suggestion! The problem with libraries is now fixed, but I’m still running into issues when trying to run a test .lsf script. I’m getting the following error (I’ve hidden the license server).
  Firstly I get this error:
  Messages file /ansys_inc/shared_files/licensing/language/en-us/ansysli_msgs.xml does not exist.
  
  And secondly this license error:
  Feature: lumerical_gui
  License path: 1055@****.**
  FlexNet Licensing error:-5,147
  Error: Failed to checkout feature 'lumerical_gui'
  No such feature exists.
  Feature: lumerical_gui
  License path: 1055@****.** Licensing error:-5,147
  
  Would you like to reconfigure your license settings?, Response: No
  The license server and port should be correct, so I am not sure what the problem is here.
  Would you have any suggestions on how to fix this?
- February 28, 2024 at 5:25 pm
  
  Lito
  Ansys Employee
  
  @samupekkaojanen,
  You can ignore the first message/notice about the XML file. This will not prevent you from checking out the license.
  The second error indicates that there is no "lumerical_gui" (enterprise license) on the server. Set your Optics Launcher > license configuration (GUI) to obtain the license from the server on the port used by the Ansys license manager (consult IT/license admins for this information).
  >> Lumerical license configuration with the Ansys Optics Launcher – Ansys Optics
  Otherwise, if you have the correct license server information, configure the "License.ini" file to checkout the "Standard" licenses from your server.
  >>Lumerical license configuration from the command line – Ansys Optics
- February 29, 2024 at 9:03 am
  
  samupekkaojanen
  Subscriber
  
  Hi Lito,
  Thank you. The license error is now resolved.
  However, I am now trying to run the topology optimization again, and am running into the exact same error as in the previous version of Lumerical. I am running the following commands:
  module load lumerical/v241
  cd $SLURM_SUBMIT_DIR
  export QT_QPA_PLATFORM=offscreen
  time "$LUMERICAL_ROOT/python/bin/python3" demux_runsim.py
  I am using the example .py file from here: https://optics.ansys.com/hc/en-us/articles/1500007188582-Topology-Optimization-of-a-4-channel-wavelength-demultiplexer-2D-TE
  And I get the following error:
  CONFIGURATION FILE {'root': '/home/opt/Lumerical/v241/api/python', 'lumapi': '/home/opt/Lumerical/v241/api/python'}
  Initializing super optimization
  Traceback (most recent call last):
  File "/home/ojanen4/demux_runsim.py", line 82, in <module>
  runSim(params, eps_bg, eps_wg, x_pos, y_pos, size_x*1e-9, size_y*1e-9, filter_R, working_dir=working_dir, beta=1)
  File "/home/ojanen4/demux_runsim.py", line 61, in runSim
  opt.run(working_dir = working_dir)
  File "/home/opt/Lumerical/v241/api/python/lumopt/optimization.py", line 460, in run
  self.initialize(working_dir=working_dir)
  File "/home/opt/Lumerical/v241/api/python/lumopt/optimization.py", line 119, in initialize
  self.one_forward = check_one_forward_sim(self.optimizations[0])
  File "/home/opt/Lumerical/v241/api/python/lumopt/optimization.py", line 82, in check_one_forward_sim
  co_opt.sim = Simulation(self.workingDir, co_opt.use_var_fdtd, co_opt.hide_fdtd_cad)
  File "/home/opt/Lumerical/v241/api/python/lumopt/utilities/simulation.py", line 22, in __init__
  self.fdtd = lumapi.MODE(hide = hide_fdtd_cad) if use_var_fdtd else lumapi.FDTD(hide = hide_fdtd_cad)
  File "/home/opt/Lumerical/v241/api/python/lumapi.py", line 1541, in __init__
  super(FDTD, self).__init__('fdtd', filename, key, hide, serverArgs, remoteArgs, **kwargs)
  File "/home/opt/Lumerical/v241/api/python/lumapi.py", line 1196, in __init__
  handle = self.__open__(iapi, product, key, hide, serverArgs, remoteArgs)
  File "/home/opt/Lumerical/v241/api/python/lumapi.py", line 1415, in __open__
  raise LumApiError(error)
  lumapi.LumApiError: 'Exception [::appOpened]: Session not found'
  Exception ignored in:
  Traceback (most recent call last):
  File "/home/opt/Lumerical/v241/api/python/lumopt/utilities/simulation.py", line 67, in __del__
  self.fdtd.close()
  AttributeError: 'Simulation' object has no attribute 'fdtd'
- February 29, 2024 at 9:05 am
  
  samupekkaojanen
  Subscriber
  
  I also tried with xvfb (time xvfb-run "$LUMERICAL_ROOT/python/bin/python3" demux_runsim.py), but it does not work either. The code gets slightly farther, but then crashes with the following error:
  CONFIGURATION FILE {'root': '/home/opt/Lumerical/v241/api/python', 'lumapi': '/home/opt/Lumerical/v241/api/python'}
  Initializing super optimization
  Checking for one forward simulation : More than One Wavelengths range, one forward simulation is not possible
  Wavelength range of source object will be superseded by the global settings.
  Traceback (most recent call last):
  File "/home/ojanen4/demux_runsim.py", line 82, in <module>
  runSim(params, eps_bg, eps_wg, x_pos, y_pos, size_x*1e-9, size_y*1e-9, filter_R, working_dir=working_dir, beta=1)
  File "/home/ojanen4/demux_runsim.py", line 61, in runSim
  opt.run(working_dir = working_dir)
  File "/home/opt/Lumerical/v241/api/python/lumopt/optimization.py", line 460, in run
  self.initialize(working_dir=working_dir)
  File "/home/opt/Lumerical/v241/api/python/lumopt/optimization.py", line 141, in initialize
  list(map(init_suboptimization, self.optimizations))
  File "/home/opt/Lumerical/v241/api/python/lumopt/optimization.py", line 137, in init_suboptimization
  cur_optimization.initialize(local_working_dir)
  File "/home/opt/Lumerical/v241/api/python/lumopt/optimization.py", line 669, in initialize
  self.geometry.add_geo(self.sim, start_params, only_update = False)
  File "/home/opt/Lumerical/v241/api/python/lumopt/geometries/topology.py", line 410, in add_geo
  fdtd.eval(script)
  File "/home/opt/Lumerical/v241/api/python/lumapi.py", line 1430, in eval
  evalScript(self.handle, code, True)
  File "/home/opt/Lumerical/v241/api/python/lumapi.py", line 300, in evalScript
  _evalScriptInternal(s, code)
  File "/home/opt/Lumerical/v241/api/python/lumapi.py", line 288, in _evalScriptInternal
  raise LumApiError("Failed to evaluate code")
  lumapi.LumApiError: 'Failed to evaluate code'
  QProcess: Destroyed while process ("/home/opt/Lumerical/v241/bin/fdtd-solutions") is still running.
- February 29, 2024 at 9:27 am
  
  samupekkaojanen
  Subscriber
  
  I also updated Lumerical on my Windows laptop, and it is now also crashing, but at a different point in the code:
  CONFIGURATION FILE {'root': 'C:\\Program Files\\Lumerical\\v241\\api\\python', 'lumapi': 'C:\\Program Files\\Lumerical\\v241\\api\\python'}
  Initializing super optimization
  Checking for one forward simulation : More than One Wavelengths range, one forward simulation is not possible
  Wavelength range of source object will be superseded by the global settings.
  Wavelength range of source object will be superseded by the global settings.
  Wavelength range of source object will be superseded by the global settings.
  Wavelength range of source object will be superseded by the global settings.
  Wavelength range of source object will be superseded by the global settings.
  Making adjoint solves
  Running solves
  Processing adjoint solves
  FOM = 0.975600292420355 (1 - 0.024399707579645047)
  FOM = 0.9134124921948557 (1 - 0.08658750780514435)
  FOM = 0.9635248911794267 (1 - 0.036475108820573254)
  FOM = 0.0029434785871421676
  FOM = 0.9126453599424964 (1 - 0.0873546400575036)
  Traceback (most recent call last):
  File "C:\Users\ojanen4\Documents\cluter testing\demux_runsim.py", line 82, in <module>
  runSim(params, eps_bg, eps_wg, x_pos, y_pos, size_x*1e-9, size_y*1e-9, filter_R, working_dir=working_dir, beta=1)
  File "C:\Users\ojanen4\Documents\cluter testing\demux_runsim.py", line 61, in runSim
  opt.run(working_dir = working_dir)
  File "C:\Program Files\Lumerical\v241\api\python\lumopt\optimization.py", line 460, in run
  self.initialize(working_dir=working_dir)
  File "C:\Program Files\Lumerical\v241\api\python\lumopt\optimization.py", line 350, in initialize
  self.optimizer.initialize(start_params=start_params,
  File "C:\Program Files\Lumerical\v241\api\python\lumopt\optimizers\optimizer.py", line 100, in initialize
  self.reset_start_params(start_params, self.scale_initial_gradient_to)
  File "C:\Program Files\Lumerical\v241\api\python\lumopt\optimizers\optimizer.py", line 108, in reset_start_params
  self.auto_detect_scaling(scale_initial_gradient_to)
  File "C:\Program Files\Lumerical\v241\api\python\lumopt\optimizers\optimizer.py", line 115, in auto_detect_scaling
  gradients = self.callable_jac(params)
  File "C:\Program Files\Lumerical\v241\api\python\lumopt\optimizers\minimizer.py", line 34, in callable_jac_local
  fom_gradients = callable_jac(params_over_scaling_factor) / self.scaling_factor
  File "C:\Program Files\Lumerical\v241\api\python\lumopt\optimization.py", line 297, in callable_jac
  jac_list = pool.map(process_adjoint_solves, enumerate(self.optimizations))
  File "C:\Program Files\Lumerical\v241\python\lib\multiprocessing\pool.py", line 364, in map
  return self._map_async(func, iterable, mapstar, chunksize).get()
  File "C:\Program Files\Lumerical\v241\python\lib\multiprocessing\pool.py", line 771, in get
  raise self._value
  File "C:\Program Files\Lumerical\v241\python\lib\multiprocessing\pool.py", line 125, in worker
  result = (True, func(*args, **kwds))
  File "C:\Program Files\Lumerical\v241\python\lib\multiprocessing\pool.py", line 48, in mapstar
  return list(map(*args))
  File "C:\Program Files\Lumerical\v241\api\python\lumopt\optimization.py", line 290, in process_adjoint_solves
  jac = optimization.calculate_gradients()
  File "C:\Program Files\Lumerical\v241\api\python\lumopt\optimization.py", line 910, in calculate_gradients
  grad_name = self.geometry.calculate_gradients_on_cad(self.sim, 'forward_fields', 'adjoint_fields', self.scaling_factor)
  File "C:\Program Files\Lumerical\v241\api\python\lumopt\geometries\topology.py", line 290, in calculate_gradients_on_cad
  sim.fdtd.eval(('params = struct;'
  File "C:\Program Files\Lumerical\v241\api\python\lumapi.py", line 1430, in eval
  evalScript(self.handle, code, True)
  File "C:\Program Files\Lumerical\v241\api\python\lumapi.py", line 300, in evalScript
  _evalScriptInternal(s, code)
  File "C:\Program Files\Lumerical\v241\api\python\lumapi.py", line 288, in _evalScriptInternal
  raise LumApiError("Failed to evaluate code")
  lumapi.LumApiError: 'Failed to evaluate code'
  Here the crash seems to occur at this line of code:
  sim.fdtd.eval(('params = struct;'
  'params.eps_levels=[{0},{1}];'
  'params.filter_radius = {2};'
  'params.beta = {3};'
  'params.eta = {4};'
  'params.dx = {5};'
  'params.dy = {6};'
  'params.dz = 0.0;'
  'dF_dp = topoparamstogradient(params,topo_rho,dF_dEps);').format(self.eps_min,self.eps_max,self.filter_R,self.beta,self.eta,self.dx,self.dy) )
- February 29, 2024 at 9:55 am
  
  samupekkaojanen
  Subscriber
  
  Quick update: Shape optimization works fine both on my Windows laptop and on the headless Linux server. I found out that using xvfb-run to run Python scripts leads to crashing. When I run the shape optimization like this, it is working perfectly:
  export QT_QPA_PLATFORM=offscreen
  time "$LUMERICAL_ROOT/python/bin/python3" splitter_with_arms.py
  Topology optimization still doesn't work.

The topic ‘Error when running a topology optimization on a cluster (Linux)’ is closed to new replies.