Hardware resource optimization

- July 14, 2023 at 4:22 pm
  
  Kaisarbek Omirzakhov
  Subscriber
  
  I am using Lumerical for 3D FDTD simulations. I have the following
  problem when running 2 simulations at the same time.
  
  Simulation 1 alone using 32 cores takes 10 hours.
  Simulation 2 alone using 32 cores takes 10 hours.
  When simulation 1 and simulation 2 are running at the same time using
  32 cores each, it takes 20 hours for each one to finish.
  
  Is there a way to optimize the resources for parallel simulations?
  Do you have suggestions for optimizing resources for big simulations?
  
  Here are the details of the simulation machine:
  OS - Red Hat Enterprise Linux Server 7.9
  Memory - 252 GiB
  Processor - Intel Xeon(R) Gold 6314U CPU @ 2.3 GHz x 64
  GNOME - v 3.28.2
  OS type - 64-bit
  Disk - 1.6 TB
- July 17, 2023 at 11:32 pm
  
  Guilin Sun
  Ansys Employee
  
  I guess that you have only one engine license. If so, since one engine license allows up to 32 cores, it is reasonable that when the two simulations run at the same time, each takes twice as long as when it runs alone. Please refer to
  https://optics.ansys.com/hc/en-us/articles/360052724713-List-of-licensed-features-by-product
  and check your license.
  
  Please also check whether one simulation really needs 32 cores. Maybe 16 or even 8 is sufficient. Parallel computing has scaling limits, so not every simulation benefits from more than 4 or 8 cores: each core simulates one block of the original file, and the data communication among the blocks also takes time. When this communication time is not negligible, the scaling deviates from a linear relationship, and the more processes are used, the further the speedup falls below the ideal linear scaling with the number of cores. Simulation efficiency therefore declines. It might be better to simulate the two files at the same time with each simulation using just 16 cores. Please set the number of cores/processes in "Resources" after some testing.
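
The scaling test suggested above can be summarized with a small script. This is a minimal sketch, not part of Lumerical: the function name `scaling_table` is made up for illustration, and the timings for 8 and 16 cores are hypothetical placeholders (only the 10-hour run at 32 cores comes from this thread). It computes the speedup and parallel efficiency relative to the smallest core count measured.

```python
def scaling_table(timings_hours):
    """Return (cores, speedup, efficiency) rows relative to the smallest core count.

    timings_hours maps a core count to the measured wall-clock time in hours.
    """
    base_cores = min(timings_hours)
    base_time = timings_hours[base_cores]
    rows = []
    for cores in sorted(timings_hours):
        speedup = base_time / timings_hours[cores]  # measured speedup
        ideal = cores / base_cores                  # speedup under linear scaling
        rows.append((cores, speedup, speedup / ideal))
    return rows


# Hypothetical timings: replace them with times measured on your own file.
example_timings = {8: 24.0, 16: 14.0, 32: 10.0}

for cores, speedup, efficiency in scaling_table(example_timings):
    print(f"{cores:>3} cores: speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

If the efficiency drops sharply between two core counts, the lower count is probably the better choice when several simulations have to share the machine.
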
- July 18, 2023 at 5:05 pm
  
  Kaisarbek Omirzakhov
  Subscriber
  
  Hi Guilin!
  Thanks for your feedback. I have figured out the issue. I will summarize my findings here for other people who face a similar problem.
  1. I have enough licenses, so licensing does not affect the simulation time.
  2. In short, the bottleneck for simulation speed is memory bandwidth. As you mentioned, increasing the number of cores does not scale the simulation time linearly. In my case, 32 cores give the fastest simulation time for a specific design. When I simulate another similar design in parallel, using another 32 cores on the same machine, the simulation speed drops by half. This is because the communication between CPU and RAM was already at its cap for a single simulation, so running a second simulation in parallel makes each one take practically twice as long.
  I hope this helps other people as well. I am attaching some useful links from the Ansys website regarding performance optimization.
  Information on Hardware Specifications
  Getting the Best FDTD Performance
  Best,
  Kaisar
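
The memory-bandwidth explanation above can be checked with simple arithmetic. The sketch below assumes a deliberately simplified model that is not taken from Ansys documentation: once the combined bandwidth demand of all jobs exceeds what the machine can deliver, every job slows down in proportion. The function names and the `bandwidth_fraction` parameter are hypothetical.

```python
def concurrent_runtime(solo_hours, n_jobs, bandwidth_fraction):
    """Estimated runtime of each of n identical jobs started at the same time.

    bandwidth_fraction is the share of the machine's memory bandwidth that one
    job uses when running alone (1.0 means it already saturates the memory bus).
    Assumption: jobs are slowed only when total demand exceeds the bandwidth.
    """
    demand = n_jobs * bandwidth_fraction
    return solo_hours * max(1.0, demand)


def sequential_finish_times(solo_hours, n_jobs):
    """Finish time of each job when the jobs run one after another."""
    return [solo_hours * (i + 1) for i in range(n_jobs)]


# Numbers from this thread: 10 hours alone, bandwidth already saturated.
print(concurrent_runtime(10.0, 2, 1.0))   # each job takes 20.0 hours
print(sequential_finish_times(10.0, 2))   # jobs finish at [10.0, 20.0] hours

# If one job used only half of the bandwidth, two jobs could overlap freely.
print(concurrent_runtime(10.0, 2, 0.5))   # each job takes 10.0 hours
```

Under this model, running the two saturating jobs one after the other finishes the second at the same 20-hour mark and delivers the first result 10 hours earlier.
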
- July 18, 2023 at 6:00 pm
  
  Guilin Sun
  Ansys Employee
  
  Thank you for your summary. Memory bandwidth is indeed one of the main factors that affect simulation speed.
  If you have enough licenses and enough cores (it seems you have 64), the simulations should not slow down, since they should use different cores. On a cluster, you can use different nodes, with each simulation using 32 cores on its own node.
  Please explore more.
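
To confirm that two simulations can use different cores, it may help to list the logical CPUs the operating system exposes and split them into disjoint groups. The following is a rough sketch using only the Python standard library. `split_cores` is a hypothetical helper, and how a group is then applied to a job (for example through the process-binding options of the MPI launcher) depends on the setup and is not shown. Note that the logical CPU count may include hyperthreads, and that disjoint cores on one machine still share the same memory bandwidth.

```python
import os


def split_cores(core_ids, n_jobs):
    """Partition core IDs into n_jobs disjoint, contiguous, near-equal groups."""
    cores = sorted(core_ids)
    size, extra = divmod(len(cores), n_jobs)
    groups, start = [], 0
    for job in range(n_jobs):
        end = start + size + (1 if job < extra else 0)
        groups.append(cores[start:end])
        start = end
    return groups


# os.sched_getaffinity exists on Linux only; fall back to the CPU count elsewhere.
if hasattr(os, "sched_getaffinity"):
    available = os.sched_getaffinity(0)
else:
    available = range(os.cpu_count() or 1)

for job, group in enumerate(split_cores(available, 2), start=1):
    print(f"job {job}: {len(group)} logical CPUs -> {group}")
```
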
