Hi Matt

OK, that model is very, very small with respect to GPU acceleration (of a solve). Without knowing anything about the cluster, other than that the compute nodes have some model of Intel i7 CPU, I'd suggest running a test to compare against using one GPU.

First, change the model so that it is not solving for all 3050 sub-steps. We only need to solve a few in order to compare compute performance, so change the load setup to solve for maybe 10 sub-steps, or maybe just the first 2-3 load steps.

Next, I usually start with 50,000 degrees of freedom (DOF) per CPU core as a baseline test. If the CPU were a leading-edge model then I'd take that down to around 30,000. But with 50k DOF per core I'd try solving on 8 CPU cores to start (I also prefer even numbers!). When done, make a copy of the output and pcs files, then solve again on 8 CPU cores plus 1 of the GPUs, and make a copy of the resulting output and pcs files. Lastly, try using 4 CPU cores and 1 GPU. Save the files, then report back the total CPU time and the total elapsed time for each solution.
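If it helps, here is a rough Python sketch of the arithmetic behind that rule of thumb and of how I'd compare the runs once you have the numbers. The function names and the 400,000 DOF figure are made up for illustration only; plug in your model's actual DOF count and the elapsed times you read from your own output files.

```python
import math


def suggested_cores(total_dof, dof_per_core=50_000):
    """Even core count that puts at most dof_per_core DOF on each core.

    Hypothetical helper: the 50,000 default is the rule-of-thumb baseline,
    not a solver setting.
    """
    cores = max(2, math.ceil(total_dof / dof_per_core))
    return cores + (cores % 2)  # round odd counts up to the next even number


def speedup(baseline_elapsed, trial_elapsed):
    """Ratio of elapsed times; a value above 1.0 means the trial run was faster."""
    return baseline_elapsed / trial_elapsed


# Illustrative DOF count only, not taken from the actual model.
example_dof = 400_000
print(suggested_cores(example_dof))          # baseline of 50k DOF per core
print(suggested_cores(example_dof, 30_000))  # tighter target for a newer CPU
```

To compare the three runs, call `speedup(elapsed_8_cores, elapsed_8_cores_plus_gpu)` and likewise for the 4 core plus GPU run, using the total elapsed times from each solution.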

Mike