{"id":198368,"date":"2026-04-30T03:27:01","date_gmt":"2026-04-30T03:27:01","guid":{"rendered":"https:\/\/innovationspace.ansys.com\/knowledge\/?post_type=topic&#038;p=198368"},"modified":"2026-05-05T15:19:50","modified_gmt":"2026-05-05T15:19:50","slug":"freeflow-gpu-buying-guide","status":"publish","type":"topic","link":"https:\/\/innovationspace.ansys.com\/knowledge\/forums\/topic\/freeflow-gpu-buying-guide\/","title":{"rendered":"FreeFlow GPU Buying Guide"},"content":{"rendered":"<p style=\"text-align: center\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-198408 size-full aligncenter\" src=\"https:\/\/innovationspace.ansys.com\/knowledge\/wp-content\/uploads\/sites\/4\/2026\/04\/FreeFlow-GPU-Buying-Guide.png\" alt=\"\" width=\"1920\" height=\"650\" srcset=\"https:\/\/innovationspace.ansys.com\/knowledge\/wp-content\/uploads\/sites\/4\/2026\/04\/FreeFlow-GPU-Buying-Guide.png 1920w, https:\/\/innovationspace.ansys.com\/knowledge\/wp-content\/uploads\/sites\/4\/2026\/04\/FreeFlow-GPU-Buying-Guide-300x102.png 300w, https:\/\/innovationspace.ansys.com\/knowledge\/wp-content\/uploads\/sites\/4\/2026\/04\/FreeFlow-GPU-Buying-Guide-1024x347.png 1024w, https:\/\/innovationspace.ansys.com\/knowledge\/wp-content\/uploads\/sites\/4\/2026\/04\/FreeFlow-GPU-Buying-Guide-768x260.png 768w, https:\/\/innovationspace.ansys.com\/knowledge\/wp-content\/uploads\/sites\/4\/2026\/04\/FreeFlow-GPU-Buying-Guide-1536x520.png 1536w, https:\/\/innovationspace.ansys.com\/knowledge\/wp-content\/uploads\/sites\/4\/2026\/04\/FreeFlow-GPU-Buying-Guide-24x8.png 24w, https:\/\/innovationspace.ansys.com\/knowledge\/wp-content\/uploads\/sites\/4\/2026\/04\/FreeFlow-GPU-Buying-Guide-36x12.png 36w, https:\/\/innovationspace.ansys.com\/knowledge\/wp-content\/uploads\/sites\/4\/2026\/04\/FreeFlow-GPU-Buying-Guide-48x16.png 48w\" sizes=\"auto, (max-width: 1920px) 100vw, 1920px\" \/><\/p>\n<p>&nbsp;<\/p>\n<blockquote>\n<p style=\"text-align: center\"><em>With <strong>Ansys FreeFlow\u2122 
<\/strong>smoothed-particle hydrodynamics (SPH) simulation software, you can use one or more <strong>Graphics Processing Units (GPUs)<\/strong> to process your simulations. Before investing in new hardware, see the FAQs below to find guidelines and recommendations.<\/em><\/p>\n<p>&nbsp;<\/p><\/blockquote>\n<h3  id=\"FREEFLOW-GPU-PERFORMANCE-BENCHMARK\"><strong><span style=\"font-size: 50px;font-weight: 900;color: #fedb8d\">\/<\/span> FreeFlow GPU Performance Benchmark<\/strong><\/h3>\n<ol>\n<li><a href=\"#Rockyperfbench\">FreeFlow GPU Performance Benchmark<\/a><\/li>\n<li><a href=\"#RockyGPU1\">The benefits of GPU<\/a><\/li>\n<li><a href=\"#RockyGPU2\">Performance Benchmark<\/a><\/li>\n<li><a href=\"#RockyGPU3\">Benchmark results for Ansys FreeFlow 2026 R1<\/a><\/li>\n<li><a href=\"#RockyGPU4\">Relevant conclusions on simulation performance<\/a><\/li>\n<\/ol>\n<h3  id=\"FREEFLOW-GPU-FAQS\"><strong><span style=\"font-size: 50px;font-weight: 900;color: #fedb8d\">\/<\/span> FreeFlow GPU FAQs<\/strong><\/h3>\n<p><strong><a href=\"#RockyFAQs\">FreeFlow GPU FAQs<\/a><\/strong><\/p>\n<ol>\n<li><strong><a href=\"#FAQ1\">Which license is required to run FreeFlow on GPUs?<\/a><\/strong><\/li>\n<li><strong><a href=\"#FAQ2\">Which GPU cards are recommended for use with FreeFlow?<\/a><\/strong><\/li>\n<li><strong><a href=\"#FAQ3\">What are the minimum requirements for GPU cards that will be used for running FreeFlow?<\/a><\/strong><\/li>\n<li><strong><a href=\"#FAQ4\">Which cards are best for running SPH?<\/a><\/strong><\/li>\n<li><strong><a href=\"#FAQ5\">Can you provide some examples for comparison?<\/a><\/strong><\/li>\n<li><strong><a href=\"#FAQ6\">There are a lot of cards on that list! How do I choose the one that is right for me?<\/a><\/strong><\/li>\n<li><strong><a href=\"#FAQ7\">I have only a mid-range budget. 
Can you recommend a card for me?<\/a><\/strong><\/li>\n<li><strong><a href=\"#FAQ8\">If you had to recommend one, all-around best card for most situations, which would it be?<\/a><\/strong><\/li>\n<li><strong><a href=\"#FAQ9\">Won\u2019t the (non-recommended) card I already have work just as well as a recommended one?<\/a><\/strong><\/li>\n<li><strong><a href=\"#FAQ10\">Assuming I use a recommended GPU card, how much faster can I expect my simulations to run?<\/a><\/strong><\/li>\n<\/ol>\n<div id=\"RockyFAQs\"><\/div>\n<div>\n<hr \/>\n<\/div>\n<div><\/div>\n<h2  id=\"FREEFLOW-GPU-FAQS\"><strong><span style=\"font-size: 50px;font-weight: 900;color: #fedb8d\">\/<\/span> FreeFlow GPU FAQs<\/strong><\/h2>\n<div id=\"FAQ1\"><\/div>\n<h3  id=\"1-WHICH-LICENSE-IS-REQUIRED-TO-RUN-FREEFLOW-ON-GPUS\"><strong><span style=\"font-size: 50px;font-weight: 900;color: #fedb8d\">\/<\/span>1. Which license is required to run FreeFlow on GPUs?<\/strong><\/h3>\n<p>Ansys FreeFlow follows the Ansys HPC Pack licensing scheme. In addition, one Ansys FreeFlow license allows you to run a single job on up to 75 GPU Streaming Multiprocessors (SMs)*. It makes no difference whether those SMs are on a single GPU card or spread across multiple cards.<\/p>\n<p>For example, you need 3 Ansys HPC Pack licenses to run your FreeFlow simulation on one A100 card (108 SMs) or on four RTX 3060 cards (28 SMs each, 112 SMs in total). However, if you want to run on one or two RTX 3060 cards, you do not need to buy any Ansys HPC Pack licenses.<\/p>\n<p>To sum up, if you want to run any FreeFlow simulation on more than 75 SMs, you will need at least one Ansys HPC Pack license. 
The table below lists the number of Ansys HPC Pack licenses you need for a given SM count.<\/p>\n<table>\n<tbody>\n<tr>\n<td width=\"287\"><strong>SMs<\/strong><\/td>\n<td width=\"288\"><strong>Ansys HPC Pack Licenses<\/strong><\/td>\n<\/tr>\n<tr>\n<td width=\"287\">1 \u2013 75<\/td>\n<td width=\"288\">0<\/td>\n<\/tr>\n<tr>\n<td width=\"287\">76 \u2013 83<\/td>\n<td width=\"288\">1<\/td>\n<\/tr>\n<tr>\n<td width=\"287\">84 \u2013 107<\/td>\n<td width=\"288\">2<\/td>\n<\/tr>\n<tr>\n<td width=\"287\">108 \u2013 203<\/td>\n<td width=\"288\">3<\/td>\n<\/tr>\n<tr>\n<td width=\"287\">204 \u2013 587<\/td>\n<td width=\"288\">4<\/td>\n<\/tr>\n<tr>\n<td width=\"287\">588 \u2013 2123<\/td>\n<td width=\"288\">5<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Now consider another situation, in which you have one RTX 4090 card (128 SMs) or five RTX 3060 cards (140 SMs). In both cases, you will need 3 Ansys HPC Pack licenses (see the table below).<\/p>\n<h3 style=\"text-align: center\" style=\"text-align: center\"  id=\"HPC-FEATURES-REQUIRED-ACCORDING-TO-THE-CARDS-SM-COUNT\">HPC features required according to the card(s) SM count<\/h3>\n<table width=\"772\">\n<tbody>\n<tr>\n<td colspan=\"3\" width=\"379\"><strong>RTX 3060<\/strong><\/td>\n<td colspan=\"3\" width=\"393\"><strong>RTX 4090<\/strong><\/td>\n<\/tr>\n<tr>\n<td width=\"85\"><strong>Cards<\/strong><\/td>\n<td width=\"105\"><strong>SM Count<\/strong><\/td>\n<td width=\"189\"><strong>Ansys HPC Pack Licenses<\/strong><\/td>\n<td width=\"98\"><strong>Cards<\/strong><\/td>\n<td width=\"105\"><strong>SM Count<\/strong><\/td>\n<td width=\"190\"><strong>Ansys HPC Pack Licenses<\/strong><\/td>\n<\/tr>\n<tr>\n<td width=\"85\">1<\/td>\n<td width=\"105\">28<\/td>\n<td width=\"189\">0<\/td>\n<td width=\"98\">1<\/td>\n<td width=\"105\">128<\/td>\n<td width=\"190\">3<\/td>\n<\/tr>\n<tr>\n<td width=\"85\">2<\/td>\n<td width=\"105\">56<\/td>\n<td width=\"189\">0<\/td>\n<td 
width=\"98\">2<\/td>\n<td width=\"105\">256<\/td>\n<td width=\"190\">4<\/td>\n<\/tr>\n<tr>\n<td width=\"85\">3<\/td>\n<td width=\"105\">84<\/td>\n<td width=\"189\">2<\/td>\n<td width=\"98\">3<\/td>\n<td width=\"105\">384<\/td>\n<td width=\"190\">4<\/td>\n<\/tr>\n<tr>\n<td width=\"85\">4<\/td>\n<td width=\"105\">112<\/td>\n<td width=\"189\">3<\/td>\n<td width=\"98\">4<\/td>\n<td width=\"105\">512<\/td>\n<td width=\"190\">4<\/td>\n<\/tr>\n<tr>\n<td width=\"85\">5<\/td>\n<td width=\"105\">140<\/td>\n<td width=\"189\">3<\/td>\n<td width=\"98\">5<\/td>\n<td width=\"105\">640<\/td>\n<td width=\"190\">5<\/td>\n<\/tr>\n<tr>\n<td width=\"85\">6<\/td>\n<td width=\"105\">168<\/td>\n<td width=\"189\">3<\/td>\n<td width=\"98\">6<\/td>\n<td width=\"105\">768<\/td>\n<td width=\"190\">5<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><span style=\"font-size: 9.0pt\">*For more information about SMs, refer to the APPENDIX section.<\/span><\/p>\n<p><strong><em>Notes:<\/em><\/strong><\/p>\n<ul>\n<li><em>When using multiple GPU\u2019s, licensing is based on the total number of SM\u2019s across all GPU\u2019s irrespective of the number of GPU\u2019s.<\/em><\/li>\n<li><em>All available SM\u2019s are used on a GPU card. It is not possible to restrict usage to a subset of SM\u2019s.<\/em><\/li>\n<li><em>All GPU cards should reside on a single server, i.e., Ansys FreeFlow does not support distributed GPU computing.<\/em><\/li>\n<\/ul>\n<h3  id=\"2-WHICH-GPU-CARDS-ARE-RECOMMENDED-FOR-USE-WITH-FREEFLOW\"><strong><span style=\"font-size: 50px;font-weight: 900;color: #fedb8d\">\/<\/span>2. Which GPU cards are recommended for use with FreeFlow?<\/strong><\/h3>\n<p>As an SPH-based tool, FreeFlow performs best on GPUs with high VRAM capacity and high memory bandwidth. These two characteristics improve the neighbor list allocation and provide higher efficiency for our memory-bound algorithm. 
We selected a few GPUs that are well suited to running FreeFlow:<\/p>\n<ul>\n<li><strong>Server<\/strong>: A30, A100, L40, H100, H100 NVL and H200.\n<ul>\n<li><strong>PROS<\/strong>: Essential for large-scale SPH simulations; highest memory bandwidth<\/li>\n<li><strong>CONS: <\/strong>More expensive; must be installed on a server rack; no video output<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<ul>\n<li><strong>Workstation<\/strong>: Quadro RTX A6000, RTX A2000, RTX A4000 and RTX A5000\n<ul>\n<li><strong>PROS:<\/strong> Good VRAM; can be installed on individual workstations; has video output<\/li>\n<li><strong>CONS:<\/strong> Cost is still high<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<ul>\n<li><strong>Gaming<\/strong>: RTX 3060, RTX 3070, RTX 4060, RTX 4090 and RTX 5060\n<ul>\n<li><strong>PROS<\/strong>: Good performance in SPH simulations; inexpensive; can be installed on individual workstations; has video output<\/li>\n<li><strong>CONS:<\/strong> Limited VRAM, which restricts the maximum resolution of the SPH domain<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>For best results, use the recommended GPU cards listed above for FreeFlow processing.<\/p>\n<h3  id=\"3-WHAT-ARE-THE-MINIMUM-REQUIREMENTS-FOR-GPU-CARDS-THAT-WILL-BE-USED-FOR-RUNNING-FREEFLOW\"><strong><span style=\"font-size: 50px;font-weight: 900;color: #fedb8d\">\/<\/span>3. 
W<\/strong><strong>hat are the minimum requirements for GPU cards that will be used for running FreeFlow?<\/strong><\/h3>\n<p>For GPU or multi-GPU processing, you must choose one or more NVIDIA GPU cards (computing or gaming) that meet the following minimum criteria:<\/p>\n<ul>\n<li>At least 4 GB of memory.<\/li>\n<li>Fast single-precision processing capabilities.<\/li>\n<li>At least 200 GB\/s of memory bandwidth.<\/li>\n<li>A CUDA compute capability of 6.0 or higher.<\/li>\n<li>A graphics driver version that supports CUDA toolkit version 12.8 or higher.<\/li>\n<\/ul>\n<p><em>(See the NVIDIA website for a CUDA driver table that lists which driver version supports which toolkit version.)<\/em><\/p>\n<h3  id=\"4-WHICH-CARDS-ARE-BEST-FOR-RUNNING-SPH\"><strong><span style=\"font-size: 50px;font-weight: 900;color: #fedb8d\">\/<\/span>4. Which cards are best for running SPH?<\/strong><\/h3>\n<p>For simulations with only SPH elements, choose a GPU with high single-precision performance and high memory bandwidth to speed up your simulations. GPUs with more memory allow you to run bigger cases with millions of SPH elements, so keep this in mind when selecting hardware. For memory bandwidth, choose the highest value you can, because higher bandwidth improves the efficiency of the memory-bound algorithm.<\/p>\n<h3  id=\"5-CAN-YOU-PROVIDE-SOME-EXAMPLES-FOR-COMPARISON\"><strong><span style=\"font-size: 50px;font-weight: 900;color: #fedb8d\">\/<\/span>5. Can you provide some examples for comparison?<\/strong><\/h3>\n<p>For SPH simulations, memory bandwidth is more important than single-precision performance. Therefore, prefer the GPUs with the highest memory bandwidth and treat single-precision performance as a secondary criterion. Among the GPUs listed below, the best single-precision performance is achieved by the RTX 5090 and the RTX 6000 Ada. The RTX 5090 has about 15% higher single-precision throughput than the RTX 6000 Ada (104.8 vs. 91 Tflops) and almost double the memory bandwidth (1790 vs. 960 GB\/s). 
Moreover, the RTX 5090 is cheaper, making it a good choice for workstations.<\/p>\n<p>Another interesting option is the H100 NVL. Although it is expensive compared with the RTX 5090 and RTX 6000 Ada, its memory bandwidth combined with its memory size can easily compensate for its lower single-precision performance. This combination of memory size and memory bandwidth enables simulations with millions of SPH elements.<\/p>\n<p>Another interesting GPU is the RTX 4090. It has one of the highest single-precision rates listed below and over 1000 GB\/s of memory bandwidth. As a consequence, its cost-benefit ratio is one of the best among the GPUs presented here, and it is recommended for workstations.<\/p>\n<h3  id=\"6-THERE-ARE-A-LOT-OF-CARDS-ON-THAT-LIST-HOW-DO-I-CHOOSE-THE-ONE-THAT-IS-RIGHT-FOR-ME\"><strong><span style=\"font-size: 50px;font-weight: 900;color: #fedb8d\">\/<\/span>6. <\/strong><strong>There are a lot of cards on that list! How do I choose the one that is right for me?<\/strong><\/h3>\n<p>Choosing the card that will work best for you depends on the type of simulations you will be running, how fast you need those simulations to complete, and the budget available to spend on your hardware and HPC licenses.<\/p>\n<p>The table below provides a quick comparison of the most common workstation, server, and gaming cards.<\/p>\n<p><em>*Last updated March 2026. 
Prices are estimates and can vary by region, market demand, and other factors.<\/em><\/p>\n<div id=\"FAQ7\">\n<table style=\"height: 1795px\" width=\"730\">\n<tbody>\n<tr>\n<td width=\"90\"><\/td>\n<td width=\"47\"><strong>Card Name<\/strong><\/td>\n<td width=\"62\"><strong>Memory Size (GB)<\/strong><\/td>\n<td width=\"79\"><strong>Memory Bandwidth (GB\/s)<\/strong><\/td>\n<td width=\"34\"><strong>SMs<\/strong><\/td>\n<td width=\"78\"><strong>Single Precision (Tflops)<\/strong><\/td>\n<td width=\"67\"><strong>Double Precision (Gflops)<\/strong><\/td>\n<td width=\"162\"><strong>Estimated Purchase Price* (USD)<\/strong><\/td>\n<\/tr>\n<tr>\n<td rowspan=\"7\" width=\"90\"><strong>Workstation Cards<\/strong><\/td>\n<td width=\"47\">RTX A6000<\/td>\n<td width=\"62\">48<\/td>\n<td width=\"79\">768<\/td>\n<td width=\"34\">84<\/td>\n<td width=\"78\">38.71<\/td>\n<td width=\"67\">605<\/td>\n<td width=\"162\">4,600 \u2013 5,200<\/td>\n<\/tr>\n<tr>\n<td width=\"47\">RTX 6000 Ada<\/td>\n<td width=\"62\">48<\/td>\n<td width=\"79\">960<\/td>\n<td width=\"34\">142<\/td>\n<td width=\"78\">91<\/td>\n<td width=\"67\">1423<\/td>\n<td width=\"162\">6,800 \u2013 7,500<\/td>\n<\/tr>\n<tr>\n<td width=\"47\">RTX A2000<\/td>\n<td width=\"62\">12<\/td>\n<td width=\"79\">288<\/td>\n<td width=\"34\">26<\/td>\n<td width=\"78\">7.9<\/td>\n<td width=\"67\">124.8<\/td>\n<td width=\"162\">450 \u2013 600<\/td>\n<\/tr>\n<tr>\n<td width=\"47\">RTX A4000<\/td>\n<td width=\"62\">16<\/td>\n<td width=\"79\">448<\/td>\n<td width=\"34\">48<\/td>\n<td width=\"78\">19.2<\/td>\n<td width=\"67\">299.5<\/td>\n<td width=\"162\">800 \u2013 1,100<\/td>\n<\/tr>\n<tr>\n<td width=\"47\">RTX A5000<\/td>\n<td width=\"62\">24<\/td>\n<td width=\"79\">768<\/td>\n<td width=\"34\">64<\/td>\n<td width=\"78\">27.8<\/td>\n<td width=\"67\">433.9<\/td>\n<td width=\"162\">2,200 \u2013 2,500<\/td>\n<\/tr>\n<tr>\n<td width=\"47\">RTX PRO 2000<\/td>\n<td width=\"62\">16<\/td>\n<td width=\"79\">288<\/td>\n<td 
width=\"34\">34<\/td>\n<td width=\"78\">17.03<\/td>\n<td width=\"67\">266.2<\/td>\n<td width=\"162\">800 &#8211; 840<\/td>\n<\/tr>\n<tr>\n<td width=\"47\">RTX PRO 4000<\/td>\n<td width=\"62\">24<\/td>\n<td width=\"79\">432<\/td>\n<td width=\"34\">70<\/td>\n<td width=\"78\">24.05<\/td>\n<td width=\"67\">375.8<\/td>\n<td width=\"162\">1700 &#8211; 2000<\/td>\n<\/tr>\n<tr>\n<td colspan=\"8\" width=\"632\"><\/td>\n<\/tr>\n<tr>\n<td rowspan=\"7\" width=\"90\"><strong>Server Cards<\/strong><\/td>\n<td width=\"47\">A30<\/td>\n<td width=\"62\">24<\/td>\n<td width=\"79\">930<\/td>\n<td width=\"34\">56<\/td>\n<td width=\"78\">10.3<\/td>\n<td width=\"67\">5161<\/td>\n<td width=\"162\">3,500 \u2013 5,000<\/td>\n<\/tr>\n<tr>\n<td width=\"47\">A100<\/td>\n<td width=\"62\">40<\/td>\n<td width=\"79\">1555<\/td>\n<td width=\"34\">108<\/td>\n<td width=\"78\">19.5<\/td>\n<td width=\"67\">9746<\/td>\n<td width=\"162\">8,000 \u2013 11,000<\/td>\n<\/tr>\n<tr>\n<td width=\"47\">A100<\/td>\n<td width=\"62\">80<\/td>\n<td width=\"79\">1935<\/td>\n<td width=\"34\">108<\/td>\n<td width=\"78\">19.5<\/td>\n<td width=\"67\">9746<\/td>\n<td width=\"162\">12,000 \u2013 17,000<\/td>\n<\/tr>\n<tr>\n<td width=\"47\">H100<\/td>\n<td width=\"62\">80<\/td>\n<td width=\"79\">2039<\/td>\n<td width=\"34\">114<\/td>\n<td width=\"78\">51.22<\/td>\n<td width=\"67\">25610<\/td>\n<td width=\"162\">25,000 \u2013 32,000<\/td>\n<\/tr>\n<tr>\n<td width=\"47\">H100 NVL<\/td>\n<td width=\"62\">94<\/td>\n<td width=\"79\">3940<\/td>\n<td width=\"34\">132<\/td>\n<td width=\"78\">60.32<\/td>\n<td width=\"67\">30160<\/td>\n<td width=\"162\">30,000 \u2013 38,000<\/td>\n<\/tr>\n<tr>\n<td width=\"47\">L40<\/td>\n<td width=\"62\">48<\/td>\n<td width=\"79\">864<\/td>\n<td width=\"34\">142<\/td>\n<td width=\"78\">90.52<\/td>\n<td width=\"67\">1414<\/td>\n<td width=\"162\">9,500 \u2013 11,000<\/td>\n<\/tr>\n<tr>\n<td width=\"47\">H200<\/td>\n<td width=\"62\">141<\/td>\n<td width=\"79\">4800<\/td>\n<td 
width=\"34\">132<\/td>\n<td width=\"78\">60.32<\/td>\n<td width=\"67\">30160<\/td>\n<td width=\"162\">31,000 \u2013 42,000<\/td>\n<\/tr>\n<tr>\n<td colspan=\"8\" width=\"632\"><\/td>\n<\/tr>\n<tr>\n<td rowspan=\"11\" width=\"90\"><strong>Gaming Cards<\/strong><\/td>\n<td width=\"47\">RTX 3060 Ti<\/td>\n<td width=\"62\">8<\/td>\n<td width=\"79\">448<\/td>\n<td width=\"34\">38<\/td>\n<td width=\"78\">16.2<\/td>\n<td width=\"67\">253.1<\/td>\n<td width=\"162\">250 \u2013 300 (Used)<\/td>\n<\/tr>\n<tr>\n<td width=\"47\">RTX 3070<\/td>\n<td width=\"62\">8<\/td>\n<td width=\"79\">448<\/td>\n<td width=\"34\">46<\/td>\n<td width=\"78\">20.31<\/td>\n<td width=\"67\">317.4<\/td>\n<td width=\"162\">300 \u2013 400 (Used)<\/td>\n<\/tr>\n<tr>\n<td width=\"47\">RTX 3070 Ti<\/td>\n<td width=\"62\">8<\/td>\n<td width=\"79\">608.3<\/td>\n<td width=\"34\">48<\/td>\n<td width=\"78\">21.75<\/td>\n<td width=\"67\">339.8<\/td>\n<td width=\"162\">300 \u2013 400 (Used)<\/td>\n<\/tr>\n<tr>\n<td width=\"47\">RTX 3080<\/td>\n<td width=\"62\">10<\/td>\n<td width=\"79\">760<\/td>\n<td width=\"34\">68<\/td>\n<td width=\"78\">29.77<\/td>\n<td width=\"67\">465.1<\/td>\n<td width=\"162\">450 \u2013 600 (Used)<\/td>\n<\/tr>\n<tr>\n<td width=\"47\">RTX 3080 Ti<\/td>\n<td width=\"62\">12<\/td>\n<td width=\"79\">912.4<\/td>\n<td width=\"34\">80<\/td>\n<td width=\"78\">34.1<\/td>\n<td width=\"67\">532.8<\/td>\n<td width=\"162\">450 \u2013 600 (Used)<\/td>\n<\/tr>\n<tr>\n<td width=\"47\">RTX 3090<\/td>\n<td width=\"62\">24<\/td>\n<td width=\"79\">936.2<\/td>\n<td width=\"34\">82<\/td>\n<td width=\"78\">35.58<\/td>\n<td width=\"67\">556<\/td>\n<td width=\"162\">700 \u2013 900 (Used)<\/td>\n<\/tr>\n<tr>\n<td width=\"47\">RTX 3090 Ti<\/td>\n<td width=\"62\">24<\/td>\n<td width=\"79\">1008<\/td>\n<td width=\"34\">84<\/td>\n<td width=\"78\">40<\/td>\n<td width=\"67\">625<\/td>\n<td width=\"162\">700 \u2013 900 (Used)<\/td>\n<\/tr>\n<tr>\n<td width=\"47\">RTX 4090<\/td>\n<td width=\"62\">24<\/td>\n<td 
width=\"79\">1008<\/td>\n<td width=\"34\">128<\/td>\n<td width=\"78\">82.58<\/td>\n<td width=\"67\">1290<\/td>\n<td width=\"162\">1,600 \u2013 1,900<\/td>\n<\/tr>\n<tr>\n<td width=\"47\">RTX 5060<\/td>\n<td width=\"62\">8<\/td>\n<td width=\"79\">448<\/td>\n<td width=\"34\">30<\/td>\n<td width=\"78\">19.18<\/td>\n<td width=\"67\">299.6<\/td>\n<td width=\"162\">320 \u2013 380<\/td>\n<\/tr>\n<tr>\n<td width=\"47\">RTX 5060 Ti<\/td>\n<td width=\"62\">16<\/td>\n<td width=\"79\">448<\/td>\n<td width=\"34\">36<\/td>\n<td width=\"78\">23.7<\/td>\n<td width=\"67\">370.4<\/td>\n<td width=\"162\">450 \u2013 550<\/td>\n<\/tr>\n<tr>\n<td width=\"47\">RTX 5090<\/td>\n<td width=\"62\">32<\/td>\n<td width=\"79\">1790<\/td>\n<td width=\"34\">170<\/td>\n<td width=\"78\">104.8<\/td>\n<td width=\"67\">1637<\/td>\n<td width=\"162\">3700 \u2013 5000<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<\/div>\n<h3  id=\"7-I-HAVE-ONLY-A-MID-RANGE-BUDGET-CAN-YOU-RECOMMEND-A-CARD-FOR-ME\"><strong><span style=\"font-size: 50px;font-weight: 900;color: #fedb8d\">\/<\/span>7. I have only a mid-range budget. Can you recommend a card for me?<\/strong><\/h3>\n<p>For a mid-range budget, you can choose between RTX 5060, RTX 5060 Ti, RTX A4000 and RTX PRO 4000. They are similar GPUs in terms of memory bandwidth and single-precision performance. Then you can select based on each GPU&#8217;s VRAM and price. The RTX 5060 is the cheapest of those four GPUs but has the lowest memory. The cost-benefit of the RTX 5060 Ti is great, providing the same VRAM as an RTX A4000. Besides that, one RTX 5060 Ti can cost half of one RTX A4000.<\/p>\n<h3  id=\"8-IF-YOU-HAD-TO-RECOMMEND-ONE-ALL-AROUND-BEST-CARD-FOR-MOST-SITUATIONS-WHICH-WOULD-IT-BE\"><strong><span style=\"font-size: 50px;font-weight: 900;color: #fedb8d\">\/<\/span>8. 
If you had to recommend one, all-around best card for most situations, which would it be?<\/strong><\/h3>\n<p>All in all, the H100 NVL is the FreeFlow team\u2019s preferred choice. It has one of the highest memory bandwidths among the GPUs listed, and it delivers the most processing capacity for its cost.<\/p>\n<p>And if your simulation does not fit on a single GPU, you can use FreeFlow\u2019s multi-GPU support to combine the memory of several GPUs.<\/p>\n<h3  id=\"9-WONT-THE-NON-RECOMMENDED-CARD-I-ALREADY-HAVE-WORK-JUST-AS-WELL-AS-A-RECOMMENDED-ONE\"><strong><span style=\"font-size: 50px;font-weight: 900;color: #fedb8d\">\/<\/span>9. Won\u2019t the (non-recommended) card I already have work just as well as a recommended one?<\/strong><\/h3>\n<p>Different GPU cards can differ in performance by an order of magnitude, which is why we have recommended only the cards that will have the best performance with FreeFlow. Just because FreeFlow appears to run fine on a non-recommended GPU card does not mean that the card is improving processing performance. And if it is not improving performance, then there is no point in running your simulations on GPUs.<\/p>\n<p>To see the huge range of performance differences for yourself, visit the NVIDIA website and review the processing power, single-precision performance, and memory bandwidth of the GPU cards.<\/p>\n<div id=\"FAQ10\"><\/div>\n<h3  id=\"10-ASSUMING-I-USE-A-RECOMMENDED-GPU-CARD-HOW-MUCH-FASTER-CAN-I-EXPECT-MY-SIMULATIONS-TO-RUN\"><strong><span style=\"font-size: 50px;font-weight: 900;color: #fedb8d\">\/<\/span>10. Assuming I use a recommended GPU card, how much faster can I expect my simulations to run?<\/strong><\/h3>\n<p>Compared to a CPU with 48 cores, even one L40 has been shown to speed up processing about 23-fold; with one H100 NVL, a 2-day simulation can be completed in hours. 
However, the actual speed-up depends on what you are simulating, how large your case is, and how much budget you have.<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<h3 content_id=\"section1\" content_id=\"section1\"  id=\"APPENDIX-WHAT-ARE-STREAMING-MULTIPROCESSORS-SMS\">Appendix: What are Streaming Multiprocessors (SMs)?<\/h3>\n<p>Streaming Multiprocessors (SMs) are the key components of NVIDIA GPUs responsible for executing parallel computations, including tasks related to rendering and other general-purpose computing. An SM consists of multiple CUDA cores, and more powerful GPU cards typically contain more SMs.<\/p>\n<p>&nbsp;<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-182937\" src=\"https:\/\/innovationspace.ansys.com\/knowledge\/wp-content\/uploads\/sites\/4\/2024\/07\/gpu-guide-300x109.png\" alt=\"\" width=\"881\" height=\"320\" srcset=\"https:\/\/innovationspace.ansys.com\/knowledge\/wp-content\/uploads\/sites\/4\/2024\/07\/gpu-guide-300x109.png 300w, https:\/\/innovationspace.ansys.com\/knowledge\/wp-content\/uploads\/sites\/4\/2024\/07\/gpu-guide-1024x373.png 1024w, https:\/\/innovationspace.ansys.com\/knowledge\/wp-content\/uploads\/sites\/4\/2024\/07\/gpu-guide-768x279.png 768w, https:\/\/innovationspace.ansys.com\/knowledge\/wp-content\/uploads\/sites\/4\/2024\/07\/gpu-guide-1536x559.png 1536w, https:\/\/innovationspace.ansys.com\/knowledge\/wp-content\/uploads\/sites\/4\/2024\/07\/gpu-guide-2048x745.png 2048w, https:\/\/innovationspace.ansys.com\/knowledge\/wp-content\/uploads\/sites\/4\/2024\/07\/gpu-guide-24x9.png 24w, https:\/\/innovationspace.ansys.com\/knowledge\/wp-content\/uploads\/sites\/4\/2024\/07\/gpu-guide-36x13.png 36w, https:\/\/innovationspace.ansys.com\/knowledge\/wp-content\/uploads\/sites\/4\/2024\/07\/gpu-guide-48x17.png 48w\" sizes=\"auto, (max-width: 881px) 100vw, 881px\" \/><\/p>\n<p style=\"text-align: left\"><em>GH100 Full GPU architecture with 144 SMs <\/em><\/p>\n<p>&nbsp;<\/p>\n<div 
id=\"Rockyperfbench\"><\/div>\n<h2  id=\"FREEFLOW-GPU-PERFORMANCE-BENCHMARK\"><strong><span style=\"font-size: 50px;font-weight: 900;color: #fedb8d\">\/<\/span>FreeFlow GPU Performance Benchmark<\/strong><\/h2>\n<p>Ansys FreeFlow is a Smoothed Particle Hydrodynamics (SPH) solver. SPH is a meshless, Lagrangian computational method used to simulate the dynamics of continuum media, such as liquids and gases. Unlike traditional grid-based (Eulerian) methods that look at fluid passing through a fixed point, SPH follows the individual &#8220;elements&#8221; of the fluid as they move through space.<\/p>\n<p>It is well suited to free-surface flows, such as fluid sloshing, dam breaks, tire aquaplaning, and similar phenomena. Its Lagrangian nature allows tracking fluid interfaces without complex pre-processing tasks or additional techniques such as mesh-deformation algorithms.<\/p>\n<div id=\"RockyGPU1\"><\/div>\n<h3 content_id=\"section1\" content_id=\"section1\"  id=\"THE-BENEFITS-OF-GPU\">The benefits of GPU<\/h3>\n<p>In an SPH simulation, the fluid is often discretized into millions of elements. On CPU hardware, the computations are processed by a limited number of high-performance cores, whereas a GPU spreads them across thousands of simpler cores. In other words, a GPU can perform thousands of mathematical operations concurrently where a CPU handles a few hundred. Because the SPH algorithm is inherently parallel, Ansys FreeFlow scales more efficiently on GPU hardware, and computation time tends to be significantly reduced.<\/p>\n<p>Memory bandwidth plays an important role in SPH simulations. The SPH elements move and gather information from their neighbors at every time step. This leads to frequent, irregular memory access patterns. CPUs can reduce memory wait times using caches (L1, L2, or L3), but they lack the memory bandwidth to handle millions of SPH elements efficiently. GPUs, on the other hand, are designed for heavy workloads and offer massive memory bandwidth. 
For example, comparing the Intel Xeon Gold 6542Y and the NVIDIA H100, the GPU has about 9 times the memory bandwidth of the CPU, and this ratio can be even higher for other hardware. This superior data-transfer capability makes GPUs the natural choice for large-scale SPH simulations.<\/p>\n<div id=\"RockyGPU2\"><\/div>\n<p><strong>Performance Benchmark<\/strong><\/p>\n<p><strong>Criterion 1: SPH elements<\/strong><\/p>\n<p>To assess the performance of Ansys FreeFlow, we use a vehicle driving through a water puddle (car wading). The fluid is modeled using 16 million SPH elements, and the geometries have a total of about 32 million triangles.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-198450 size-full\" src=\"https:\/\/innovationspace.ansys.com\/knowledge\/wp-content\/uploads\/sites\/4\/2026\/04\/sphimage1.png\" alt=\"\" width=\"619\" height=\"324\" srcset=\"https:\/\/innovationspace.ansys.com\/knowledge\/wp-content\/uploads\/sites\/4\/2026\/04\/sphimage1.png 619w, https:\/\/innovationspace.ansys.com\/knowledge\/wp-content\/uploads\/sites\/4\/2026\/04\/sphimage1-300x157.png 300w, https:\/\/innovationspace.ansys.com\/knowledge\/wp-content\/uploads\/sites\/4\/2026\/04\/sphimage1-24x13.png 24w, https:\/\/innovationspace.ansys.com\/knowledge\/wp-content\/uploads\/sites\/4\/2026\/04\/sphimage1-36x19.png 36w, https:\/\/innovationspace.ansys.com\/knowledge\/wp-content\/uploads\/sites\/4\/2026\/04\/sphimage1-48x25.png 48w\" sizes=\"auto, (max-width: 619px) 100vw, 619px\" \/><\/p>\n<p><em>Figure 1 \u2013 Car wading simulation.<\/em><\/p>\n<p>&nbsp;<\/p>\n<p><strong>Criterion 2: Processing type<\/strong><\/p>\n<p>The following processing configurations were evaluated:<\/p>\n<ul>\n<li>CPU: Intel(R) Xeon(R) Gold 6542Y CPU @ 2.90 GHz on 48 cores<\/li>\n<li>1 GPU: NVIDIA H100, NVIDIA A100, NVIDIA L40<\/li>\n<li>2 GPUs: NVIDIA H100, NVIDIA A100, NVIDIA L40<\/li>\n<\/ul>\n<p><strong>Criterion 3: Performance measurement<\/strong><\/p>\n<p>Two measurements were taken 
at steady state to evaluate performance:<\/p>\n<ul>\n<li><strong>Simulation Pace (speed up)<\/strong>, which is the amount of hardware processing time (duration) required to advance the simulation two seconds. The speed-up metric is computed using the CPU pace as the reference.<\/li>\n<li><strong>GPU Memory Usage<\/strong>, which is the amount of memory being used on the GPU while processing the simulation. In general, lower memory usage allows more SPH elements to be processed and\/or more calculations to be performed.<\/li>\n<\/ul>\n<p><strong>Benchmark results for Ansys FreeFlow 2026 R1<\/strong><\/p>\n<p><strong>Relevant conclusions on simulation performance<\/strong><\/p>\n<p>Figure 2 shows the performance speed-up for the IISPH solver in Ansys FreeFlow. It is worth mentioning that FreeFlow has two other solvers, WCSPH and DFSPH (Beta).<\/p>\n<ul>\n<li>The results show a large performance gain from running an SPH simulation on a GPU. Even in the worst-case scenario, one GPU reduces the simulation pace by a factor of about 23 compared to a run on 48 CPU cores.<\/li>\n<li>The multi-GPU results also highlight the benefits of running SPH on graphics cards: with two GPUs, you can reduce the simulation time by a factor of about 65 compared with 48 CPU cores.<\/li>\n<\/ul>\n<p><a href=\"https:\/\/innovationspace.ansys.com\/knowledge\/wp-content\/uploads\/sites\/4\/2026\/04\/freeflow_gpu_26r1_1.svg\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-198451 size-full\" src=\"https:\/\/innovationspace.ansys.com\/knowledge\/wp-content\/uploads\/sites\/4\/2026\/04\/freeflow_gpu_26r1_1.svg\" alt=\"\" width=\"778\" height=\"371\" \/><\/a><\/p>\n<p><em>Figure 2 \u2013 GPU speed-up based on Simulation Pace (compared with 48 CPU cores) achieved for the car wading case.<\/em><\/p>\n<p>&nbsp;<\/p>\n<p><strong>Relevant conclusions on GPU memory consumption<\/strong><\/p>\n<p>Figure 3 shows the GPU memory usage for the SPH simulation within 
Ansys FreeFlow.<\/p>\n<ul>\n<li>An SPH simulation with 16 million SPH elements can be performed on just one GPU. Theoretically, it should be possible to run a case with around 35 million SPH elements on a card with at least 80 GB of memory, such as the NVIDIA H100 NVL or the NVIDIA A100 80GB PCIe.<\/li>\n<li>The maximum GPU memory consumption is about 27 GB across two GPUs. This result indicates that Ansys FreeFlow can be employed to analyze many complex problems.<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<p><a href=\"https:\/\/innovationspace.ansys.com\/knowledge\/wp-content\/uploads\/sites\/4\/2026\/04\/freeflow_gpu_26r1_2.svg\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-198452 size-full\" src=\"https:\/\/innovationspace.ansys.com\/knowledge\/wp-content\/uploads\/sites\/4\/2026\/04\/freeflow_gpu_26r1_2.svg\" alt=\"\" width=\"804\" height=\"374\" \/><\/a><\/p>\n<p><em>Figure 3 \u2013 Total GPU memory consumption for the SPH simulation.<\/em><\/p>\n<p>&nbsp;<\/p>\n<p style=\"text-align: right\">Ansys FreeFlow\u2122 smoothed-particle hydrodynamics (SPH) simulation software.<\/p>\n<div class=\"ReactFieldEditor\" data-automationtype=\"clientFormField\">\n<div 
class=\"ReactFieldEditor-titleContainer\"><\/div>\n<\/div>\n<p>&nbsp;<\/p>\n","protected":false},"template":"","class_list":["post-198368","topic","type-topic","status-publish","hentry","topic-tag-ansys-freeflow","topic-tag-gpu-buying-guide","filter-by-application-ansys-freeflow"],"aioseo_notices":[],"acf":[],"custom_fields":[{"0":{"_edit_lock":["1779183575:2316"],"_edit_last":["17114"],"_aioseo_title":[null],"_aioseo_description":[null],"_aioseo_keywords":["a:0:{}"],"_aioseo_og_title":[null],"_aioseo_og_description":[null],"_aioseo_og_article_section":[""],"_aioseo_og_article_tags":["a:0:{}"],"_aioseo_twitter_title":[null],"_aioseo_twitter_description":[null],"application_name":[""],"_application_name":["field_64a80903c8e15"],"filter_by_optics_product":["Lumerical"],"_filter_by_optics_product":["field_64fb192ba3121"],"family":[""],"_family":["field_64a809229a857"],"siebel_km_number":[""],"_siebel_km_number":["field_63ecbffce60db"],"salesforce_km_number":[""],"_salesforce_km_number":["field_63ecc018e60dc"],"km_published_date":[""],"_km_published_date":["field_64c77704499dd"],"product_version":[""],"_product_version":["field_64c776cb4fd2e"],"_bbp_forum_id":["27796"],"_bbp_topic_id":["198473"],"_bbp_author_ip":["192.104.24.234"],"_bbp_last_reply_id":["0"],"_bbp_last_active_id":["198369"],"_bbp_last_active_time":["2026-04-25 
03:27:01"],"_bbp_reply_count":["0"],"_bbp_reply_count_hidden":["0"],"_bbp_voice_count":["0"],"_btv_view_count":["628"],"_bbp_likes_count":["1"],"_wp_old_date":["2026-04-25"]},"test":"articlesansys-com"}],"_links":{"self":[{"href":"https:\/\/innovationspace.ansys.com\/knowledge\/wp-json\/wp\/v2\/topics\/198368","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/innovationspace.ansys.com\/knowledge\/wp-json\/wp\/v2\/topics"}],"about":[{"href":"https:\/\/innovationspace.ansys.com\/knowledge\/wp-json\/wp\/v2\/types\/topic"}],"version-history":[{"count":16,"href":"https:\/\/innovationspace.ansys.com\/knowledge\/wp-json\/wp\/v2\/topics\/198368\/revisions"}],"predecessor-version":[{"id":198473,"href":"https:\/\/innovationspace.ansys.com\/knowledge\/wp-json\/wp\/v2\/topics\/198368\/revisions\/198473"}],"wp:attachment":[{"href":"https:\/\/innovationspace.ansys.com\/knowledge\/wp-json\/wp\/v2\/media?parent=198368"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}