-
August 4, 2025 at 9:10 pm
abtharpe42
Subscriber

I'm trying to figure out how to use the native GPU solver in Fluent 2024 R1 on my desktop to simulate a combustor with intricate nozzle geometry. My mesh has almost 4.8 million poly-hexcore cells to get good resolution in the small nozzle passageways; I can probably bring the cell count down somewhat. The Chemkin mechanism I'm using is the Stagni 2023 mechanism, with roughly 30 species and 200 reactions. I know Fluent 2025 R2 just released with the Chemistry Agglomeration feature for the native GPU solver, but I probably won't have access to that for a while. I'm stuck on 2024 R1, which has no chemistry acceleration and only allows Direct Integration. I also run my simulations in double precision.

My workstation desktop has an Intel i9-14900 CPU with 8 P-cores and 16 E-cores (hyperthreading disabled) and 64 GB of RAM. The GPU is an NVIDIA RTX 2000 Ada with 16 GB of VRAM.

My attempts to run a simulation so far have yielded slow initialization phases, especially with hybrid initialization, and the run seems to freeze or stall at the start of the calculation phase with no progress for minutes on end. When using 2 cores, my CPU utilization stays at 100% and my GPU oscillates anywhere between 30% and 100%.

Is the combustion scenario I've described overkill for this computer? I have not found a straight answer online, but given the long wait times I suspect it is too heavy for my hardware. What recommendations, advice, or rules of thumb do you have that could help me make sure I am utilizing my GPU as optimally as possible?
-
August 7, 2025 at 2:11 pm
jcooper
Ansys Employee

Hi:
A good rule of thumb is about 150k nodes per processor, so I would say you are over the limit, as combustion is one of the more data- and memory-intensive types of simulation you can run.
To avoid overloading GPU memory, it is recommended that you match the number of CPU processes to the number of dedicated GPUs. This is the simplest way to optimize GPU resource utilization across your hardware. If you have a many-to-one mapping of CPU processes to GPUs, GPU remapping can be used. For more information, search for this topic in the Fluent help documentation:
https://ansyshelp.ansys.com/Views/Secured/corp/v252/en/flu_ug/flu_ug_sec_gpu_solver_starting.html#flu_ug_gpgpu_solver_remapping
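As a rough sanity check of that 150k-per-process guideline against the mesh described above (a sketch only; the 150k figure is a guideline, not a hard limit):

```python
# Back-of-the-envelope sizing check using the ~150k-per-process
# rule of thumb mentioned above, applied to the mesh in this thread.
cells = 4_800_000            # poly-hexcore cell count from the original post
cells_per_process = 150_000  # rule-of-thumb guideline

# Ceiling division: how many processes keep each one near the guideline
processes_needed = -(-cells // cells_per_process)
print(processes_needed)  # → 32
```

At 4.8 million cells, the guideline suggests roughly 32 processes, which is well beyond 8 P-cores or a single 16 GB GPU, consistent with the stalls described in the original post.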
-
August 13, 2025 at 5:23 pm
abtharpe42
Subscriber

Sorry for the late response. I was under the impression that a GPU could do the work of multiple CPUs and that the Ada 2000 was decently powerful. Are you saying that even when using the GPU I should stay at about 150k cells for a combustion simulation, as if I were using CPUs instead? I've read that GPUs grant the most benefit when the mesh is pretty dense, so that's why I made it several million cells. Eight P-cores ran this simulation in almost 2 days, by the way.
-
August 14, 2025 at 4:08 pm
jcooper
Ansys Employee

Hi:
The benefit of a GPU is limited to certain tasks, such as solving equations; not all models and tasks benefit equally. (The RTX 2000 Ada has 16 GB of memory, which is still relatively low when you consider the number of equations being solved in combustion cases.)
While a CPU can switch rapidly between different instruction streams, a GPU takes a high volume of the same instruction and pushes it through at high speed. A GPU can complete simple, repetitive tasks faster than a CPU because it breaks them into smaller pieces and finishes them in parallel.
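That "split one big uniform task into parallel chunks" idea can be sketched on the CPU with the standard library. This is illustrative only: Python threads will not actually speed up CPU-bound work because of the GIL, but the decomposition is the same pattern a GPU applies across thousands of lightweight threads:

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum_of_squares(chunk):
    # The same simple instruction applied to many values -- the kind of
    # uniform, repetitive work that maps well onto a GPU.
    return sum(x * x for x in chunk)

data = list(range(100_000))
n_workers = 8
# Split the work into independent chunks, one per worker
chunks = [data[i::n_workers] for i in range(n_workers)]

with ThreadPoolExecutor(max_workers=n_workers) as pool:
    total = sum(pool.map(partial_sum_of_squares, chunks))

# The chunked parallel result matches the serial computation
print(total == sum(x * x for x in data))  # → True
```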
-
August 14, 2025 at 4:19 pm
abtharpe42
Subscriber

Alright, I think I understand the limitations a lot better now. I think I'll stick with just using my CPUs for now. I have a couple more pressing questions on that front that I would really appreciate answers to:
1) For Intel CPUs with hybrid architectures, i.e. both performance and efficiency cores (8 and 16 respectively in my case), is it better to assign a mix of both core types for heavy simulations, simply to spread the computational load across more cores, or is it better to stay on the 8 P-cores only, since they have the higher raw performance?
2) Is there a way to force Fluent to use only the P-cores on Windows? I've looked around quite a bit to find an answer to this, but I've gotten nowhere.
-
August 14, 2025 at 4:26 pm
abtharpe42
Subscriber

To clarify the second question, I meant to say: Is there a way to force Fluent to automatically use P-cores without having to manually set the affinity in Task Manager every time I open Fluent?
-
August 14, 2025 at 4:47 pm
jcooper
Ansys Employee

Hi:
Generally, the more cores you assign to a job, the better your speedup should be. You can test for this, and I would recommend it. You may notice diminishing returns beyond a certain number of cores if inter-core communication (bus speed) is slow, but there should always be some gain.
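The diminishing returns mentioned above follow the shape of Amdahl's law. Here is an illustrative sketch, where the parallel fraction p = 0.95 is an assumed number for demonstration, not a measured Fluent value:

```python
# Amdahl's law: ideal speedup on n cores when a fraction p of the
# work parallelizes perfectly. p = 0.95 is an assumption for illustration.
def speedup(p: float, n: int) -> float:
    return 1.0 / ((1.0 - p) + p / n)

for n in (2, 4, 8, 16):
    print(f"{n:>2} cores: {speedup(0.95, n):.2f}x")
```

Each doubling of core count buys less than the previous one (the serial 5% dominates), which is why testing the scaling on your own case is worthwhile.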
There is no way to force Fluent to use a specific type of core. Process affinity is your best bet.
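One workaround for setting affinity automatically is a small launcher script. This is a sketch, not an official Fluent option: the install path below is a placeholder for a typical 2024 R1 layout, and it assumes that with hyperthreading disabled, logical processors 0-7 are the eight P-cores (verify this once in Task Manager):

```python
# Build a CPU-affinity mask for the P-cores and print a Windows launch
# command. Assumption: with hyperthreading disabled on the i9-14900,
# logical processors 0-7 are the eight P-cores.
p_cores = range(8)
mask = sum(1 << cpu for cpu in p_cores)  # 0b11111111 = 0xFF

# Placeholder path -- adjust to your actual Fluent installation.
fluent_exe = r"C:\Program Files\ANSYS Inc\v241\fluent\ntbin\win64\fluent.exe"

# Windows "start /affinity <hexmask>" pins the launched process to those
# cores; the empty "" is the required window-title argument.
print(f'start "" /affinity {mask:X} "{fluent_exe}" 3ddp -t8')
```

Pasting the printed line into a `.bat` file gives a one-click launcher; `3ddp -t8` matches the double-precision, 8-process setup discussed above.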
-
August 14, 2025 at 4:48 pm
abtharpe42
Subscriber

Cool. Thanks for the help.
-
August 14, 2025 at 8:45 pm
jcooper
Ansys Employee

You're welcome!