Computer suitable for Lumerical simulation

- May 6, 2021 at 5:04 pm
  
  xyjxyj1992
  Subscriber
  
  Hello, we are trying to get a more powerful PC for running Lumerical simulations. Does anyone have recommendations for CPUs or RAM that can help accelerate the simulations, or should we just go for higher capacity and a faster clock rate? Is there any setting that can also help increase calculation efficiency (such as letting the GPU come into play)? Thank you so much!
- May 6, 2021 at 6:03 pm
  
  Lito
  Ansys Employee
  
  Lumerical does not support GPU acceleration. The best option is a multi-processor computer. These systems provide good performance because each processor has its own memory bus connection to the RAM. Simulation speeds are typically limited by the memory bandwidth between RAM and processor, so having one memory bus per processor means the simulation speed tends to scale well with the number of processors. See this post for more information. This post on our FDTD tests/benchmarks might come in handy.
  Best Lito
- May 7, 2021 at 2:25 am
  
  xyjxyj1992
  Subscriber
  
  Thank you so much for your reply! Do you mean that the more cores (or threads) the better, or that having more than one CPU in a computer can help accelerate the simulation?
- May 8, 2021 at 12:59 am
  
  Lito
  Ansys Employee
  
  From the article, I think the more memory channels your CPU supports, the better. For example, an 8-core system with 8 memory channels will perform better than one with only 2 or 4 memory channels. Dual-socket (two-CPU) systems might offer better CPU-memory bandwidth than a single-CPU system.
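
To put rough numbers on the memory-channel point above, here is a minimal sketch that estimates theoretical peak memory bandwidth per CPU socket. The DDR4-3200 transfer rate is an assumption chosen for illustration (it is not from the thread), the function name is hypothetical, and sustained bandwidth in a real FDTD run will be lower than this peak.

```python
def peak_memory_bandwidth_gbs(channels, transfer_rate_mts, bus_width_bytes=8):
    """Theoretical peak memory bandwidth of one CPU socket, in GB/s.

    channels          -- number of populated memory channels
    transfer_rate_mts -- memory transfer rate in MT/s (e.g. 3200 for DDR4-3200)
    bus_width_bytes   -- bytes moved per transfer per channel (64-bit channel = 8)
    """
    return channels * transfer_rate_mts * 1e6 * bus_width_bytes / 1e9


# Assumption: DDR4-3200 memory; substitute the rate of the actual modules.
ASSUMED_RATE_MTS = 3200

for channels in (2, 4, 8):
    bandwidth = peak_memory_bandwidth_gbs(channels, ASSUMED_RATE_MTS)
    print(f"{channels} channels: {bandwidth:.1f} GB/s theoretical peak")
```

Under these assumptions the peak scales directly with the channel count, which is why a CPU with 8 channels can feed a bandwidth-limited solver better than one with 2 or 4, provided every channel is actually populated with a memory module.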
- July 28, 2021 at 5:17 pm
  
  xyjxyj1992
  Subscriber
  
  Hi Lito, I actually have another question regarding the benchmark scores on this website: /forum/discussion/25782/lumerical-fdtd-simulation-benchmarks. Is the m5.24xlarge result of 1130 Mnodes/s obtained with two CPUs running simultaneously or with just a single CPU?
- July 28, 2021 at 7:11 pm
  
  Lito
  Ansys Employee
  
  As shown in the table, the instance has 2 Intel Xeon Platinum 8175 CPUs with a total of 96 threads/processes.
  The FDTD solver speed for the sample simulation using 96 threads is 1130 Mnodes/s.
- July 29, 2021 at 4:25 pm
  
  xyjxyj1992
  Subscriber
  
  I see, thanks for your reply. Does that mean that with a single 8175 CPU you can only reach 565 Mnodes/s?
- July 29, 2021 at 4:39 pm
  
  Lito
  Ansys Employee
  
  I don't think that would be the case. The difference in performance is not linear in the number of CPUs/cores, as seen in the results from the article. We would have to test on 1 CPU to get the FDTD solver speed for the same simulation file used in the article. Speed could vary depending on the simulation and the machine.
- July 30, 2021 at 4:22 pm
  
  xyjxyj1992
  Subscriber
  
  Thanks for your reply, I understand the last part. But what I really want to know is whether 2 CPUs give you twice the speed, or less, compared to a single CPU with the same configuration and the same simulation file. Thank you!
- July 30, 2021 at 4:34 pm
  
  Lito
  Ansys Employee
  
  Yes, machines with 2 CPUs/processors have better performance, but the FDTD solver rate does not scale linearly, i.e. doubling the bandwidth does not mean that the FDTD solver rate will double. As seen in our example, the FDTD solver rate does not increase linearly with the number of processes/threads used to run the simulation. Going from 8 to 16 to 36 processes does not provide a linear performance increase.
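
One way to quantify "better, but not linear" is a parallel scaling efficiency: the measured speedup divided by the ideal (linear) speedup. The sketch below is only illustrative. The 1130 Mnodes/s figure is the dual-CPU result quoted in this thread and 565 Mnodes/s is the linear guess from the question above, but the 700 Mnodes/s single-CPU rate is a made-up placeholder, not a measurement, and the function name is hypothetical.

```python
def scaling_efficiency(rate_small, units_small, rate_large, units_large):
    """Measured speedup divided by ideal linear speedup.

    rate_*  -- FDTD solver rate (e.g. Mnodes/s) for each configuration
    units_* -- number of CPUs (or processes) used in each configuration
    Returns 1.0 for perfectly linear scaling, less than 1.0 otherwise.
    """
    measured_speedup = rate_large / rate_small
    ideal_speedup = units_large / units_small
    return measured_speedup / ideal_speedup


DUAL_CPU_RATE = 1130.0  # Mnodes/s, 2 CPUs / 96 threads, quoted in the thread

# Linear guess from the thread: one CPU delivers exactly half the rate.
print(scaling_efficiency(565.0, 1, DUAL_CPU_RATE, 2))

# Hypothetical placeholder, NOT a measurement: one CPU reaches 700 Mnodes/s.
print(scaling_efficiency(700.0, 1, DUAL_CPU_RATE, 2))
```

An efficiency below 1.0 in the second case would mean the second CPU still helps, just by less than a factor of two. Only a benchmark of the same simulation file on a single CPU can supply the real single-CPU number.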
- September 22, 2021 at 5:20 pm
  
  16arnoldk
  Subscriber
  
  Hi Lito, this line of conversation and the supporting documentation have been very helpful for understanding the differences in performance between various configurations. It seems that a dual-processor configuration almost always achieves better performance than a single processor (with double the specs). However, this brings me to a new issue that I have not yet been able to find an answer to:
  Is there a way to specify whether a single process is applied half each to the two CPUs in a dual setup vs. only on one of the CPUs in a dual setup?
  This becomes important when considering the allocation of solvers (which we have a finite supply of). If running a gradient / particle swarm optimization or a sweep with several simulations in parallel, I notice that dual processor machines will take up solvers following: #solvers = 2*#simulations running concurrently. Thus, for a modest number of particles in one generation, we quickly add up to our limit.
  While dividing tasks across both processors is advantageous for a one-off simulation, I would argue that this benefit is reduced when considering many simulations at once. There should be very little performance difference between (for an example of running 8 simulations in parallel with access to dual quad core processors) running half of the 8 simulations across 8 cores (4 from each processor) vs. running 4 simulations across 4 cores of one processor and 4 across 4 for the other. So, is this something that I can control in the FDTD resource manager? Or perhaps elsewhere?
  Thanks!
  KP
- September 22, 2021 at 8:41 pm
  
  Lito
  Ansys Employee
  
  I think you are talking about pinning the process to a specific CPU/thread. You can use Intel MPI for thread/process pinning. Add the "-env" options in the FDTD advanced configuration on Windows.
  Best Lito
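
As a rough illustration of the pinning idea above, the sketch below builds one Intel MPI `-env` option string per concurrent simulation, giving each simulation its own contiguous block of cores through the `I_MPI_PIN_PROCESSOR_LIST` environment variable. The helper name is hypothetical, and the core numbering (logical CPUs 0-3 on the first processor, 4-7 on the second) is an assumption. Actual numbering depends on the machine and should be checked before relying on it.

```python
from typing import List


def pinning_env_options(num_simulations: int, total_cores: int) -> List[str]:
    """Return one '-env ...' option string per simulation.

    Each simulation is pinned to its own contiguous range of logical CPUs.
    Assumes total_cores divides evenly among the simulations and that
    contiguous CPU numbers belong to the same physical processor.
    """
    if num_simulations <= 0 or total_cores % num_simulations != 0:
        raise ValueError("total_cores must divide evenly among the simulations")

    cores_each = total_cores // num_simulations
    options = []
    for index in range(num_simulations):
        first = index * cores_each
        last = first + cores_each - 1
        options.append(f"-env I_MPI_PIN_PROCESSOR_LIST {first}-{last}")
    return options


# Example: 2 concurrent simulations on a dual quad-core machine (8 cores).
for option in pinning_env_options(2, 8):
    print(option)
```

Each printed string would go into the advanced configuration options of a separate resource, so that one simulation stays on one processor instead of being split across both.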
- September 23, 2021 at 2:13 pm
  
  16arnoldk
  Subscriber
  
  Hi Lito, thanks a lot! This looks like exactly the solution I need!
  Best KP
- September 23, 2021 at 10:49 pm
  
  Lito
  Ansys Employee
  
  You are welcome.
  Best Lito