Abstract
Computational hemodynamic modeling has been widely used in cardiovascular research and healthcare. However, the reliability of model predictions is largely dependent on the uncertainties of modeling parameters and boundary conditions, which should be carefully quantified and further reduced with available measurements. In this work, we focus on propagating and reducing the uncertainty of vascular geometries within a Bayesian framework. A novel deep learning (DL)-assisted parallel Markov chain Monte Carlo (MCMC) method is presented to enable efficient Bayesian posterior sampling and geometric uncertainty reduction. A DL model is built to approximate the geometry-to-hemodynamic map, which is trained actively using online data collected from parallel MCMC chains and utilized for early rejection of unlikely proposals to facilitate convergence with less expensive full-order model evaluations. Numerical studies on two-dimensional aortic flows are conducted to demonstrate the effectiveness and merit of the proposed method.
1 Introduction
Hemodynamics information, e.g., fractional flow reserve, wall shear stress, and blood flow pattern, are critical in cardiovascular research and personalized healthcare [1–3]. To quantify the functional information of hemodynamics from medical images, e.g., computed tomography and magnetic resonance imaging (MRI), image-based computational fluid dynamics (CFD) modeling has become a paradigm in cardiovascular study [1]. The standard workflow of constructing an image-based model includes (1) extraction of vascular anatomy from images as the geometric boundary, (2) numerical discretization and parameter specification, (3) solving discretized partial differential equation system, and (4) postprocessing simulated results. In this whole process, uncertainties can be introduced from various sources, e.g., reconstructed geometries, inflow/outflow boundary conditions, and specified mechanical properties, which largely impact the reliability of the simulated solutions [4–6]. Quantifying and reducing these uncertainties in computational hemodynamics have begun to draw attention in recent years [7–15]. Most of the existing literature has focused on forward uncertainty propagation, i.e., investigating the sensitivity of the simulated hemodynamics to the modeling inputs. For example, Sankaran and Marsden [14] developed an adaptive collocation polynomial chaos expansion (PCE) method with sparse grids to forward propagate input uncertainties on several cardiovascular modeling problems. Fleeter et al. [16] proposed a multifidelity stochastic framework to estimate propagated uncertainties for both global and local hemodynamic indicators. Guzzetti et al. [9] integrated a transverse enriched pipe element reduced order model into a PCE framework for efficient uncertainty quantification in large-scale hemodynamics models.
When observations of the system outcomes are available, which might be indirect, sparse, and noisy, the uncertainty of the modeling inputs and simulated states can be reduced within a Bayesian framework, known as inverse uncertainty quantification (UQ). The general idea is to describe the data under uncertainties in the form of likelihood functions and then compute the posterior distributions of unknown modeling inputs and simulated states of interest, given specified priors. Inverse UQ is more meaningful as it enables the assimilation of imperfect data in a probabilistic way for model inference and uncertainty reduction, which has attracted increasing attention in the cardiovascular hemodynamic community [17–20]. Although, there are many different ways to solve Bayesian data assimilation problems, e.g., variational inference [21], sequential or iterative Kalman inversion [18,19], ensemble average approaches [22], etc., they can only approximate the posterior statistics to certain extents. In order to accurately quantify the posterior distribution, Markov chain Monte Carlo (MCMC) sampling is still the gold standard for Bayesian computing and uncertainty quantification [23]. MCMC only requires pointwise evaluations of the posterior density up to its normalization constant, making the algorithm code nonintrusive and easy to implement. However, an intrinsic limitation of the MCMC algorithm lies in its inherent sequential nature and the requirement of a huge sample size. This limitation makes it computationally infeasible for applications involving expensive computer models, e.g., computational hemodynamic simulations, since the MCMC typically requires hundreds of thousands (or more) samples to reach the convergence, each of which requires a CFD forward simulation that has to be done sequentially.
To alleviate the computational burden, one way is to replace the expensive CFD model with an efficient surrogate, which can be constructed based on, e.g., radial basis function [24], Gaussian process (or Kriging) [25], generalized PCE [26], or multifidelity models [20]. However, traditional surrogate modeling techniques, e.g., Gaussian process, can only handle problems with moderate dimensions, and thus the surrogate maps are often built for predictions of global quantities instead of local hemodynamics (e.g., velocity and pressure fields). In this work, we adopt deep learning (DL) for surrogate modeling of local hemodynamics as it has been demonstrated to be capable of approximating high-dimensional operators accurately [27,28]. Although surrogate-based MCMC makes the Bayesian computing tractable, the approximation errors introduced by the surrogate may distort the computed posterior distribution. As an alternative, we leverage the DL-based surrogate modeling in a different way: instead of replacing the CFD with the surrogate emulator, we use the DL-based surrogate to accelerate the MCMC convergence by rejecting proposals at an early stage without running the expensive CFD simulations. The ideas of delayed acceptance have been explored previously in other contexts either using unconverged solutions [29,30] or GP-based emulators [31,32], showing asymptotically exact sampling behavior at the cost of potential speedup by requiring full-order model evaluation for each accepted sample. To further improve the efficiency, making MCMC sampling in parallel is desirable. In the past decade, many different parallelizable MCMC algorithms have been proposed, which can be classified into three categories: (1) embarrassingly parallel MCMC that splits the dataset into multiple partitions and sample the corresponding subposteriors in parallel [33–35], (2) multiple-proposal MCMC that proposes multiple samples simultaneously in every iteration while evaluates their densities in parallel [36–39], and (3) parallel-chain MCMC that runs multiple MCMC chains in parallel [40–43]. In this work, we formulated a DL-assisted delayed acceptance scheme within the parallel-chain MCMC framework based on an active learning strategy for geometric uncertainty reduction in computational hemodynamics. In particular, a deep neural network (DNN)-based surrogate model capturing local hemodynamics given geometric inputs is constructed and trained during MCMC sampling in an online manner. The trained DNN-surrogate will be used for the delayed acceptance to significantly improve the convergence. Moreover, multiple concurrent chains collaboratively provide sufficient training samples to actively improve the DNN surrogate during MCMC sampling, enabling a novel DL-assisted MCMC parallelism. The rest of the paper is organized as follows. The proposed DL-assisted parallel-chain MCMC algorithm is introduced in Sec. 2. Numerical results for geometric uncertainty reduction in aortic flows in two-dimensional (2D) irregular geometries are presented and discussed in Secs. 3 and 4. Finally, Sec. 5 concludes the paper.
2 Methodology
2.1 Overview.
We present a novel Bayesian MCMC computing approach assisted by active DL to estimate and reduce input uncertainties in computational hemodynamic simulations based on noisy velocity measurements. Here, we restrict the discussion to the impact of the uncertainty of the input geometry, which is usually due to imaging segmentation errors and has been demonstrated as the most dominant uncertainty source in the computational hemodynamics [44,45]. Specifically, the prior uncertainty is defined based on the parameterization of the geometric variations and will be updated to the posterior by assimilating noisy velocity data via Bayes rule using the MCMC sampling. We leverage a DL-based geometry-to-flow surrogate to facilitate the Metroplis-Hasting MCMC convergence by rejecting the proposed samples early to avoid a large number of expensive CFD simulations for the likelihood evaluations. The DL surrogate is actively trained (i.e., iteratively improved) using newly generated CFD data, gradually collected from multiple parallel MCMC chains in an online manner. Note that the major focus of this work is algorithmic development of the method that reduces the uncertainty under a Bayesian framework given sparse, noisy measurements instead of clinical applications of aortic flows. Hence, we demonstrate the feasibility of the algorithm on simplified 2D aorta geometry for proof-of-concept, which will be introduced in Sec. 2.2.
2.2 Parametric Two-Dimensional Aorta Model.
We use a 2D aorta model to demonstrate the proposed inverse geometric UQ method. The parameterization of the aorta geometry is shown in Fig. 1. To start with, a three-dimensional (3D) patient-specific aorta surface from Ref. [46] is reconstructed from medical images and preprocessed using the vascular modeling toolkit, a popular open-source module for 3D geometric reconstruction, analysis and mesh generation. For simplicity, the top branches such as subclavian and carotid arteries are trimmed off, leaving a single-channel vessel with its surface wall smoothed and outlet extended to facilitate convergence in CFD simulations. The centerline of the vessel is extracted from the aorta surface, and both the aorta surface and its centerline are projected to a flat plane, the orientation of which is determined by optimization to mostly preserve the side shape characteristics of the aorta. In particular, we take the centerline of the 3D aorta geometry and project points (m denotes the total number of points on the centerline) of the centerline to the normal vector n of the projection plane. The projected coordinates can be expressed as , where is regular Euclidean dot product. We minimize the variance of to obtain the optimized normal vector , which in this case, is . A sequence of 11 equally spaced locations is picked on the centerline along the streamwise direction, including the inlet to the outlet, where normal vectors (perpendicular to the tangent direction) are calculated. The geometric space is spanned by the centerline and radii at these locations. Hence, given a centerline C and a set of finite radius values , one can easily obtain a new aorta geometry, where the surfaces are obtained based on the spline function with a specified resolution. To model the geometric uncertainty, we add a perturbation of the radius as a function of the centerline locations. We design a perturbation method that mimics the shape of an aorta with stenosis or aneurysm at the sixth location of the aorta. We set the degree-of-freedom (number of elements) of the perturbation vector as three, which is sufficient enough for this designing purpose and higher numbers are not considered because they reduce the efficiency of the MCMC algorithm. We assign one DOF to the sixth location to enable sharp curvature changes. Then, we correlate the deformation of the upstream locations by perturbing them together in a sense of maintaining the topology of the whole upstream section during deformation, which can avoid creating unrealistic aorta shapes (e.g., extremely wavy vessel walls). Same for the downstream locations. Finally, we define a three-element perturbation vector , which are applied to the first five (in blue), the sixth (in green), and the rest radii (in purple), respectively. Each component of is drawn from a uniform distribution as the prior, where denotes the variation limit.
To systematically generate a significant amount of CFD data for training and validation purposes, we build a python routine that fully automates the tedious simulation procedure, including shape construction (mentioned above), mesh generation, boundary settings, simulation, and postprocessing. Specifically, we develop an in-house mesh generation software based on a constrained Delaunay triangulation algorithm [47]. The triangulated mesh will be automatically generated given the outline of the aorta shape, where the boundary mesh is obtained by extruding the internal ones along the normal direction with one unit length. The generated vtk mesh files are converted into the polyMesh format, which can be used for CFD simulations in openfoam, an open-source CFD platform. The incompressible, steady Navier–Stokes equations are solved based on the semi-implicit method for pressure linked equations algorithm [48]. A parabolic velocity profile is prescribed at the inlet with a maximum velocity , constant pressure outflow boundary condition is imposed at the outlet, and nonslip wall boundary condition is applied to the aorta walls, assumed to be rigid. A python wrapper is developed to enable the automation of CFD simulations in parallel. During the postprocessing step, the simulated hemodynamic field data are projected onto the surface or volumetric mesh grids, where the correspondences are established by uniformly sampling points between the centerline and the surface vertices.
2.3 Fundamentals of MCMC Bayesian Sampling.
where represents the state-to-observable map and σd denotes the observation uncertainty. The computational cost of each forward model evaluation (i.e., CFD simulation) is nontrivial and can be significantly large in many cases (e.g., 3D hemodynamics with flexible walls). It is computationally infeasible to conduct enormous forward simulations for MCMC, especially considering the sequential nature of the classic Metropolis-Hasting algorithm, making the matter worse.
2.4 Delayed-Acceptance Via Deep Neural Network Surrogate.
where and are prediction and reference on N vertices/grids.
NN type | Layers | Characteristics |
---|---|---|
Linear layer | Input feature = 3 input feature = 8 | |
Relu | — | |
Linear layer | Input feature = 8 input feature = 256 | |
MLP | Relu | — |
Linear layer | Input feature = 256 input feature = 128 | |
Relu | — | |
Linear layer | Input feature = 128 input feature = 20 |
NN type | Layers | Characteristics |
---|---|---|
Linear layer | Input feature = 3 input feature = 8 | |
Relu | — | |
Linear layer | Input feature = 8 input feature = 256 | |
MLP | Relu | — |
Linear layer | Input feature = 256 input feature = 128 | |
Relu | — | |
Linear layer | Input feature = 128 input feature = 20 |
2.5 Active Learning With Parallel MCMC Chains.
The DA strategy may fail when there is a large discrepancy between the surrogate and FOM predictions due to a significant decrease in the acceptance ratio. On the other hand, the FOM will be evaluated anyway in the DA-MCMC process for the proposed samples accepted by the surrogate model. Therefore, the natural idea is to leverage these gradually accumulated FOM solutions as data to train the DL surrogate online, known as the active learning. Specifically, during each MCMC iteration, when a new proposal passes the first step of surrogate screening, one FOM propagation will be executed, producing one high-fidelity (HF) flow data. These newly generated HF CFD solutions will be collected to enrich existing training and testing sets. The current DL surrogate will be tested on the constantly updated test set as long as a new group of Nnew data are collected. If the testing error exceeds the threshold of , the newly collected data will be added into the existing training set, which will be used to refine the DL surrogate model actively. Note that the neural network does not need to be retrained from scratch. Instead, a transfer learning strategy can be used to enable fast online model refinement. For example, the trained parameters of the current MLP are saved and can be reloaded as the initial condition or even frozen to a certain extent when the surrogate model needs to be further retrained due to the addition of training data. In this way, the training cost of the online update is extensively reduced. The details of the proposed algorithm and pseudocode are given in Algorithm 1 .
Active online training brings two major merits: (i) during initialization of the DL surrogate, data are collected uniformly over the entire parameter space, which may result in insufficiency near high-density regions. However, online training data collected from MCMC iterations are more likely to locate in high-density regions, and thus the surrogate can be refined in the area of interest. (ii) When multiple MCMC chains run simultaneously (i.e., parallel-chain MCMC), it brings no additional merits other than leveraging multiple CPU cores for classic Metropolis-Hastings or its DA variants. However, active learning-based DA Metropolis-Hastings algorithms can further leverage the parallel-chain setting, where a single surrogate can be updated over the new data collected across multiple chains, facilitating surrogate model update and convergence. Suppose the posterior distribution has more than one mode, and the parallel MCMC chains may move in different high-density regions simultaneously. The surrogate model will quickly improve over multiple high-density regions, resulting in a vast improvement in the UQ performance.
3 Numerical Results
3.1 Surrogate Computational Fluid Dynamics Modeling of Two-Dimensional Aortic Flows.
The variation range of each input shape parameter is normalized to and a uniform distribution is given as the prior (i.e., ). In order to fully explore the input parameter space, we draw 10,000 samples and create corresponding aorta meshes and solution fields using the developed python routine. Starting from the shape parameter , each python process, including construction outline, meshing, simulation, and postprocessing, takes around 6 s on a single core on IntelI XII Gold 6138 CPU. The dataset is generated by conducting a batch of 40 cases in parallel at one time. After the generation of the dataset, the velocity field data is projected into latent space using PCA. The fitted PCA transfer function is saved and serves as a transformation tool between physical and the encoded velocity data. Utilizing Ray-tune tool, the training and MLP hyperparameters are fully optimized. Subsequently, the MLP is trained on training sets of different sizes and tested on a test set of 100 unseen samples to investigate the sensitivity of the surrogate testing error with respect to training size. Each training process is conducted on an Nvidia RTX A6000 graphics processing unit with a minimum of 1500 epochs to ensure convergence.
The model testing results are shown in Fig. 2, where ϵV and ϵV M represent the relative mean square error of the velocity vector field and the velocity magnitude field. As expected, the results show a decrease for both errors as the dataset size grows. The error of the predicted velocity vector is slightly higher than that of the velocity magnitude prediction. The DL model trained on the entire CFD dataset (N = 10,000) has a test error of . In percentage, the error equals 0.005% in terms of average relative mean square error (ARMSE) of the velocity magnitude, which is very small, indicating a great consistency between the surrogate model and the CFD model. The velocity contours are shown in Fig. 3, where we randomly select five test aorta geometries, and the hemodynamic fields predicted by the DL model are compared with the CFD reference (referred to as ground truth). From comparison, one can find out that the predicted velocity contours agree well with the CFD ground truth across all selected samples.
Quantitively, we also compare the velocity profiles over cross section at all 11 locations between the predictions and the ground truth in Fig. 3. As expected, the predicted velocity profiles agree well to the ground truth for all five samples at 11 locations. Small discrepancies can be found at the sixth–eighth locations, where the sharp curvature of the aorta shape induces large gradient flows. The boundary layer at location id 8 is less accurate, which may affects wall shear stress calculations. This can be improved by adding finer boundary layer meshes which will be addressed in the more realistic CFD models in future work. We also calculated the velocity magnitude ARMSE for all the five examples as shown in Table 2. The result shows maximum discrepancy near the sixth location as expected. Note that the error at the inlet is zero because the inlet velocity is already prescribed. Although each FOM CFD simulation is relatively cheap, it still poses a great computational challenge to systematical analysis of MCMC algorithms in different settings. Since the DL model fully trained on the entire CFD dataset (N = 10,000) yields close results to the CFD ground truth, we will use it as the synthetic “CFD model” (i.e., FOM) for subsequent MCMC experiments.
3.2 Geometric Uncertainty Propagation and Reduction.
Each MCMC simulation is performed with more than 10,000 samples accepted to ensure a low sampling error. The proposal distribution for every MCMC trial is a multivariate normal distribution over three shape parameters with no correlation structures imposed. The proposal variance is set as 0.1 for all three dimensions such that good mixing is observed. The velocity magnitude information on 50 uniformly sampled locations is “observed” independently with 5% Gaussian noises. The observation data are synthesized from the CFD simulation with a specified “true” shape , assumed unknown in the MCMC inference process. The likelihood function is computed by taking the product of the conditional density of all observations given . The classic Metropolis-Hasting algorithm coupled with synthetic FOM is used to generate converged MCMC chains, which is used as the reference (ground truth) results for all MCMC experiments. The proposed method, delayed acceptance Metropolis-Hasting MCMC with online training (), is run with the same settings and will be compared against the reference results. In the proposed method, the initial surrogate model is trained on the set with the size of 10, which has a large prediction error (Fig. 2). Two chains are run in parallel on two different CPU cores and the online update interval dt is ten steps. The testing error threshold is set to be . Note that every online refinement of the DL surrogate is performed on the graphics processing unit in parallel to the main sampling process. Also, benefiting from the continuous transfer learning setup, each update is set to run only for 100 epochs, much less than the 1500 epochs for the initial training from scratch. Finally, the online update stopped at the 80th refinement step, where the test error falls below the threshold . More detailed statistics of the proposed Metropolis-Hastings MCMC algorithm is plotted in Fig. 4 in Appendix C. The posterior sampling results obtained by the proposed method are compared against the ground truth in Fig. 5. To show how the geometric uncertainty is reduced, we present the prior distributions of all three shape parameters in the first row, whereas the posterior distributions are plotted in the second row. Note that all sampled distributions are fitted by a Gaussian mixture density via sklearn Gaussian mixture module with the same hyperparameter setting. Moreover, 1000 randomly selected aorta shape samples are plotted at the rightmost column in Fig. 5, where the black line is the mean shape. The noninformative uniform prior distributions over the three parameters are updated to the posterior ones, which agree well with the ground truth solutions. Compared to the prior shape ensemble, the uncertainty range of the posterior ensemble is very small and concentrated to the synthetic truth, showing that the true aorta shape can be inferred with a significant reduction of uncertainty. The good agreement between the posterior distributions obtained by the proposed and ground truth (GT) demonstrates the effectiveness of the proposed method. However, slight discrepancies can be seen in the distributions of the second parameter θ2, which corresponds to the perturbation of the sixth radius. This might be because the curvature is rapidly changed near the location of the sixth radius, posing challenges in surrogate modeling.
In addition to the input shape uncertainty, we also investigate the posterior distributions of the maximum and mean velocity magnitudes (Vmax and Vmean) by propagating the geometric uncertainty forward to flow via the FOM simulations. The propagated prior and posterior distributions of Vmax and Vmean are shown in Fig. 6. We observed that the noninformative prior is more spread out for both maximum and mean velocity magnitudes, while their posteriors are significantly concentrated to 1.2 m/s and 0.96 m/s, respectively, indicating a notable uncertainty reduction. The density contraction is very obvious for the maximum velocity, implying the fact that a large proportion of the CFD samples has the maximum velocity near 1.2 m/s. Again, the posterior distributions obtained by the proposed perfectly agree with reference using FOM. A collection of five randomly selected aorta shapes from prior and posterior distributions are shown in Fig. 7, where the velocity magnitude contours are also plotted.
The top row shows five instances from the prior ensemble, while the collection of the posterior samples from is listed at the bottom. The variation of the velocity field is significantly diminished compared to the prior instances. In addition, we added the velocity profile comparisons between the prior and posterior ensemble and the ground truth () in Fig. 7. The results show a large variance of velocity profiles for the prior ensemble, whereas the posterior profiles are much more consistent and closer to the ground truth, indicating significant uncertainty reduction by the inverse UQ process. We also compare the ARMSE of the velocity magnitude on 11 locations along the centerline between the ensembles (prior and posterior) to the ground truth (Table 3). The result shows that the average differences between the prior ensemble and the ground truth are larger than that between the posterior ensemble and the ground truth. In other words, the prior samples have very large uncertainties at 11 locations whereas the posterior samples have much less uncertainty and high similarities to the ground truth. In general, the inverse UQ process can increase the reliability of the velocity field prediction and hence facilitate CFD-supported clinical diagnosis of cardiovascular diseases.
Locations (no.) | 1 (Inlet) | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 (Outlet) |
---|---|---|---|---|---|---|---|---|---|---|---|
Prior ARMSE | 0% | 0.064% | 0.18% | 0.32% | 6.1% | 9.7% | 8.2% | 7.7% | 7.8% | 8.1% | 8.2% |
Posterior ARMSE | 0% | 0.0076% | 0.0067% | 0.0079% | 0.010% | 0.030% | 0.035% | 0.023% | 0.016% | 0.023% | 0.030% |
Locations (no.) | 1 (Inlet) | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 (Outlet) |
---|---|---|---|---|---|---|---|---|---|---|---|
Prior ARMSE | 0% | 0.064% | 0.18% | 0.32% | 6.1% | 9.7% | 8.2% | 7.7% | 7.8% | 8.1% | 8.2% |
Posterior ARMSE | 0% | 0.0076% | 0.0067% | 0.0079% | 0.010% | 0.030% | 0.035% | 0.023% | 0.016% | 0.023% | 0.030% |
4 Discussion
4.1 Performance Comparison of Different MCMC Methods.
This section will mainly compare and discuss the performance of DL-assisted Metropolis-Hastings MCMC with different settings. Besides the reference MCMC (GT) and the method, two other MCMC algorithms are added to the comparison: one is the classic Metropolis-Hasting with a low-fidelity surrogate model trained by ten samples (LF MCMC), and the other is a DA Metropolis-Hastings with the same LF surrogate model without an online update (DA). Figure 8 shows the posterior distributions for the shape parameter and the maximum velocity Vmax obtained by different MCMC methods. In terms of the posterior mean, a straightforward observation is that the LF MCMC mean deviates from the ground truth to a large extent, especially for the shape parameters, whereas the Metropolis-Hastings algorithms featuring DA always accurately capture the posterior mean. Note that the detailed balance is not satisfied for the LF MCMC method, which results in an erroneous posterior distribution approximation. In contrast, the MCMC method featuring delayed acceptance satisfies the detailed balance, accounting for a much more accurate prediction of the mean. Viewing from the posterior shape, the proposed method stands out among all the methods. The DA MCMC method tends to heavily over-predict the posterior peak for shape parameters. Obviously, the presence of the LF filtering step alters the trace of the states and consequently results in improper sample coverage over the high-density region. For the velocity density plots, however, both the DA and methods are very close to the reference (GT).
4.2 Efficiency of Active-Learning Delayed-Acceptance MCMC.
Apparently, for the GT and the LF MCMC methods, EAR = A (defined in Eq. (2)). Whereas (defined in Eq. (5)) the DA and MCMC methods. The effective acceptance ratios are plotted in Fig. 9 (left). As expected, the LF MCMC method has the same level of EAR compared to the ground truth. However, the DA MCMC method has a surprisingly low acceptance ratio when a LF surrogate is used, which means the required FOM queries is extensively higher than that of the classic Metropolis MCMC. In fact, the intention of the delayed acceptance technique is to improve the effective acceptance ratio (second stage) by introducing a surrogate screening step (first stage). However, the addition of the screening step may backfire when the surrogate model's prediction error is significant. Active learning can come into play to timely improve the LF surrogate by leveraging gradually collected FOM solutions to address this issue. As shown in Fig. 9 (left), the proposed method raises the EAR to 45.6%, which is about five times higher than the reference MCMC. Note that the EAR will increase if the accuracy of the surrogate model grows and will asymptotically reach 100% if the surrogate has the same accuracy as the FOM.
We further evaluate the extent of efficiency improvement of the proposed method at different ratios of cost between the FOM and surrogate forward models. A normalized cost is defined which represents the total cost of the MCMC simulation normalized by the cost of executing FOM queries for the accepted samples. The relationship of those two variables is demonstrated in the right panel of Fig. 9. The result indicates the efficiency of the proposed MCMC method will be significantly improved over the reference as the FOM becomes more and more expensive. On the other hand, the cost of the MCMC method can be higher than the reference MCMC method when the surrogate model's cost is close to the FOM due to two-stage acceptance/rejection determination. The intersection of two curves is located at . Usually, for DL-based surrogate models for complex computational hemodynamics, the value is much higher than the critical value due to the extremely fast inference speed of the neural network and high cost of 3D CFD simulations. For example, the surrogate model and the CFD forward model cost about 0.006 and 6 s in the current pipeline, resulting in a high to low fidelity model ratio of . Suppose the MCMC has to collect 10,000 samples to be reliable, then a standard MCMC with CFD forward model requires 221.6 h, whereas the MCMC_OLT method costs about 37 h, resulting in a reduction in time of 83.3%. Note that in this work, we adopted the HF model as the neural network with 10,000 training data, so the HF model costs 0.006 as well and the actual time costs in this work are 13.3 min and 27.4 min for classic MCMC and the propose DA_OLT, respectively. Assuming in a more realistic application where the forward model costs 6 min and the surrogate model's cost stays the same, the computational costs for classic MCMC and proposed DA_OLT method are 13,295 and 2193 h, respectively, bringing an efficiency boost of 83.5%. In conclusion, our proposed method will bring a significant efficiency boost compared to the conventional Metropolis-Hastings MCMC method for forward and inverse UQ problems.
4.3 Limitations for Patient-Specific Applications.
In the previous results, we demonstrate the feasibility of the algorithm using synthetic data on simplified 2D aorta geometries as an example for proof of concept. However, it is noteworthy to discuss whether the method can be extended to patient-specific hemodynamic applications. Admittedly, the CFD simulation in the present work is a simplified numerical example and has not considered many realistic aspects for real patient-specific cases, e.g., 3D patient-specific geometry, measured pulsatile inflow boundary conditions, fluid–structure interactions, etc. Hence, extending the proposed algorithm to patient-specific settings requires several improvements: (1) the CFD forward model can be updated to patient-specific settings, e.g., setting the CFD simulation to be 3D and transient, assigning measure pulsatile inlet flow boundary conditions based on clinical measurements, adding Windkessel boundary conditions at the outlet, enabling fluid–structure interactions, etc. (2) Use a more sophisticated parameterization method to describe complex 3D geometries (e.g., 3D aorta with branches). (3) Use a more sophisticated neural network (e.g., graph neural network) to learn the mapping between 3D input geometry and field outputs. In future work, we expect to address those issues to apply the proposed algorithm to clinical usage. In addition, in the context of patient-specific hemodynamics simulation, information such as flowrate data obtained from 2D PC MRI at certain cross section or 3D/4D flow MRI data can be assimilated using the proposed method to reduce the uncertainty of the geometries of interest and enhance the model predictive accuracy. One can also evaluate the similarity between the predicted aorta geometry shape and the flow field to direct image data (e.g., 4D flow MRI).
5 Conclusion
This work presents a Bayesian framework for geometric uncertainty reduction using DL-assisted MCMC sampling. First, shape parameterization is performed on the aorta geometries based on a patient-specific 3D aorta sample, from which multiple aorta geometries are created by perturbation of radii at different sections. An automatic python routine is established to encapsulate all necessary simulation procedures to simulate flow information from a given aorta shape . The geometry-to-flow surrogate model is built upon the CFD dataset to learn the nonlinear relationship between the input shape to the flow solution fields. The trained surrogate model exhibits an increase of accuracy as the size of the training set grows. We propose a Metropolis-Hastings algorithm featuring delayed acceptance and active learning (), enabling the inference of the aorta shape and uncertainty reduction based on observed velocity information at sparse locations. The results show a significant uncertainty reduction given a noninformative prior. A good consistency is observed between the proposed method and the reference Metropolis-Hasting MCMC (GT MCMC) in terms of the posterior approximation. The proposed method is compared with delayed acceptance Metropolis-Hastings MCMC (DA MCMC) and standard Metropolis-Hastings MCMC algorithms equipped with the same surrogate model without an active learning component (LF MCMC). Regarding accuracy, the proposed method stands out among those methods, while the LF MCMC method completely fails due to high bias of the surrogate. As for the efficiency, again, the proposed method brings a huge efficiency improvement, whereas the standard DA Metropolis-Hastings MCMC method failed in this specific setting due to the large discrepancy between the surrogate and FOM solutions. The cost of the proposed method is further analyzed by inspecting the normalized cost change as a function of the HF-to-LF model cost ratio. It appears that the efficiency boost is more conspicuous when the FOM is more expensive. Typically, when a DNN-bases surrogate model is embedded into the proposed DA scheme with active learning, a considerable promotion of efficiency is guaranteed.
In general, this work focuses on the algorithmic development of a method to reduce the uncertainty of reconstructed geometry via assimilating sparse, noisy flow measurement data using a Bayesian framework. The major highlight of the algorithm is that it improves the UQ process, where only one version of the forward model (either full-order or reduced-order) is repetitively evaluated, and enables combining models with different levels of complexity into one UQ algorithm. In addition, the active-learning feature further reduces the cost of training data generation via selectively collecting the data in an online manner and gains an extra efficiency boost in parallel MCMC settings. Achieving patient-specific settings for the CFD forward model is out of the scope of this paper. The current pipeline is subjected to multiple limitations for its application in clinical situations, where the forward model is usually patient-specific. As a result, certain improvements have to be made for clinical applications (e.g., using patient-specific settings for the forward model, updating the parameterization for complex 3D geometry and improving the neural network architecture to learn more complicated mappings between 3D geometry and the field outputs), which will be addressed in future work.
Funding Data
National Science Foundation, United States of America (Award Nos. CMMI-1934300 and OAC-2047127; Funder ID: 10.13039/100000001).
Air Force Office of Scientific Research (AFOSR), United States of America (Award No. FA9550-22-1-0065; Funder ID: 10.13039/100000181).
College of Engineering at University of Notre Dame (Funder ID: 10.13039/100008109).
Optimized Deep Neural Network Architecture
Proposed DL-Assisted Parallel-Chain MCMC Algorithm
Data: Staring point , test error threshold , online update period dt; online dataset , initial accepted samples set |
Result: accepted sample set AS |
for each MCMC iterItion i = 1, 2,… do |
Draw next state ; calculate surrogate model posterior and acceptance ratio ; Generate a random number ; |
ifthen |
Calculate high-fidelity model posterior and acceptance ratio ; |
Generate a random number ; |
Append sample and its solution to online dataset OLT; |
ifthen |
Accept the new sample ; |
Append sample to accepted sample set AS; |
else |
Reject sample |
end |
else |
Reject sample |
end |
ifthen |
test the surrogate model on the dataset OLT; if test error then |
add new data to dataset OLT, train surrogate model |
else |
stop the training update |
end |
end |
if training is finished then |
update surrogate model |
end |
end |
Data: Staring point , test error threshold , online update period dt; online dataset , initial accepted samples set |
Result: accepted sample set AS |
for each MCMC iterItion i = 1, 2,… do |
Draw next state ; calculate surrogate model posterior and acceptance ratio ; Generate a random number ; |
ifthen |
Calculate high-fidelity model posterior and acceptance ratio ; |
Generate a random number ; |
Append sample and its solution to online dataset OLT; |
ifthen |
Accept the new sample ; |
Append sample to accepted sample set AS; |
else |
Reject sample |
end |
else |
Reject sample |
end |
ifthen |
test the surrogate model on the dataset OLT; if test error then |
add new data to dataset OLT, train surrogate model |
else |
stop the training update |
end |
end |
if training is finished then |
update surrogate model |
end |
end |