No, it definitely didn't fix the issue on its own: I'm still getting huge amounts of I/O, and elapsed time exceeds CPU time by a wide margin.
I've tried using the -m option, but it is not working as expected. To test this, I'm running on one node with 88 cores and 4GB of physical memory per core, for a total of 352GB. I tried -m 256000,
which should allocate a total of 250GB (taking the units to be MB), well under the physical memory in the node. What actually happened was: Total memory allocated for solver = 490.634 GB,
and the run then failed because it exceeded physical memory. So if the manual says -m is the total workspace allocation, it doesn't seem to be functioning that way. I tried adjusting the value downward:
-m 128000 -> still allocated the same 490.6GB
-m 64000  -> 477GB
-m 32000  -> 373GB
-m 16000  -> 166GB
So it does affect the allocation, but nonlinearly; I haven't determined what multiplier or algorithm it's using. I also tried -m 3072 to see whether the value is applied per core (3GB x 88 cores would be about 264GB), but then it allocated only 163GB and the run was no better than if I hadn't set -m at all.
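To make the pattern concrete, here is a quick sanity-check script tabulating requested vs. actual allocation for the runs above. It assumes -m is specified in MB (so the requested total is m/1024 GB); the allocation figures are my measurements, not anything the solver documents.

```python
# Observed solver allocations for each -m setting (measurements from the runs above).
# Assumption: -m is given in MB, so requested total = m / 1024 GB.
trials = [
    (256000, 490.634),  # failed: exceeded the node's 352GB physical memory
    (128000, 490.6),
    (64000,  477.0),
    (32000,  373.0),
    (16000,  166.0),
    (3072,   163.0),    # per-core test: 3GB x 88 cores would be ~264GB
]

for m_mb, actual_gb in trials:
    requested_gb = m_mb / 1024
    ratio = actual_gb / requested_gb
    print(f"-m {m_mb:>6}  requested {requested_gb:7.1f} GB  "
          f"actual {actual_gb:7.1f} GB  ratio {ratio:5.2f}x")
```

The ratio is nowhere near constant (roughly 2x at -m 256000, climbing to about 54x at -m 3072), so whatever -m controls, it is neither a simple total cap nor a per-core cap.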
Any other ideas?