Some time ago I wrote a short compilation of facts explaining why Computational Fluid Dynamics (CFD) is a hard field, both from a research perspective and for industry applications. I outlined how the parameterisation of turbulence models makes mathematical analysis hard, among other topics that add complexity to CFD.
Recently, that article received some interesting feedback from a CFD focus group on LinkedIn. Many commented on the article and nearly all participants agreed: CFD is hard because validation is hard (if you’d like to see the thread of comments, click here). So why is validation hard for Computational Fluid Dynamics?
To set up a CFD simulation we use data that describes the geometry of the model, the material properties of the fluid, and the environment in which the model is placed. This data may come from a designer, a database or a field experiment, for example. A common feature of this data is that it is uncertain, meaning that each data set is only known up to a finite precision, possibly under a known statistical variation.
To perform a CFD simulation we need to select samples from the data, to set up a combination of parameters that represents one possible configuration within the restrictions posed by the uncertainty in the data. Estimating the effect that this uncertainty in the data has on the CFD simulation is referred to as “uncertainty quantification”.
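To make the idea of sampling from uncertain data concrete, here is a minimal Monte Carlo sketch. It is not a CFD simulation: as a stand-in I use the Hagen-Poiseuille formula for the pressure drop in laminar pipe flow, and the nominal values and standard deviations for viscosity and diameter are assumptions chosen only for illustration.

```python
import math
import random
import statistics


def pressure_drop(mu, diameter, length=1.0, flow_rate=1.0e-5):
    """Hagen-Poiseuille pressure drop [Pa] for laminar pipe flow.

    This closed-form formula stands in for an expensive CFD solve.
    """
    return 128.0 * mu * length * flow_rate / (math.pi * diameter**4)


def monte_carlo_uq(n_samples=10_000, seed=0):
    """Propagate input uncertainty by sampling and re-evaluating the model."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        # Assumed input uncertainties (illustrative, not measured data):
        mu = rng.gauss(1.0e-3, 5.0e-5)  # viscosity [Pa s], 5% std dev
        d = rng.gauss(0.01, 1.0e-4)     # pipe diameter [m], 1% std dev
        samples.append(pressure_drop(mu, d))
    return statistics.mean(samples), statistics.stdev(samples)


nominal_dp = pressure_drop(1.0e-3, 0.01)
mean_dp, std_dp = monte_carlo_uq()

print(f"nominal pressure drop: {nominal_dp:.2f} Pa")
print(f"sampled mean: {mean_dp:.2f} Pa, std dev: {std_dp:.2f} Pa")
```

Each sample is one possible configuration within the uncertainty of the data. Because the pressure drop depends on the fourth power of the diameter, a 1% uncertainty in diameter contributes almost as much to the output spread as a 5% uncertainty in viscosity. In a real CFD setting every sample is a full simulation, which is why uncertainty quantification is expensive.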
A CFD simulation is only a small isolated part of reality, where boundary conditions are used to model the exterior world that is not part of the simulation. A boundary condition is not exact; it is just an approximation. The approximation may stem from uncertainty in data, but also from the mathematical model and the numerical method used in the CFD simulation. But what is the impact of these approximations? The propagation of model and numerical approximation errors can be analysed by mathematical methods and can also be approached as an uncertainty quantification problem.
Using a simulation model with enough free parameters may enable you to adjust the model so that the CFD simulation fits the experimental data. But this can lead to an overfitted model that does not generalise to other sets of experimental data, which limits its use to situations where the result is, to a degree, already known. This is not uncommon for RANS models, for example. To trust a CFD simulation for a new problem without validation data, you need to be able to generalise your model without recalibrating the model parameters, quoting John von Neumann in this thought-provoking read.
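The effect of too many free parameters can be illustrated with a toy example that has nothing to do with turbulence modelling as such. In the sketch below, the "experiment" is synthetic: a linear response with added noise, which is my assumption for illustration. A model with one free parameter per data point (a polynomial interpolant) is compared with a two-parameter straight line, first on the calibration data and then on new points outside the calibrated range.

```python
import random


def true_response(x):
    """Synthetic 'physics' used to generate the data (an assumption)."""
    return 2.0 * x + 1.0


def fit_line(xs, ys):
    """Least-squares straight line: two free parameters."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_interpolant(xs, ys):
    """Lagrange polynomial through every point: one parameter per point."""
    def model(x):
        total = 0.0
        for j, (xj, yj) in enumerate(zip(xs, ys)):
            basis = 1.0
            for m, xm in enumerate(xs):
                if m != j:
                    basis *= (x - xm) / (xj - xm)
            total += yj * basis
        return total
    return model


def max_error(model, xs, ys):
    return max(abs(model(x) - y) for x, y in zip(xs, ys))


rng = random.Random(1)
calib_x = [i / 7 for i in range(8)]                       # calibration range [0, 1]
calib_y = [true_response(x) + rng.gauss(0.0, 0.05) for x in calib_x]
new_x = [1.2, 1.4]                                        # a "new problem"
new_y = [true_response(x) for x in new_x]

line = fit_line(calib_x, calib_y)
interp = fit_interpolant(calib_x, calib_y)

line_calib_err = max_error(line, calib_x, calib_y)
interp_calib_err = max_error(interp, calib_x, calib_y)
line_new_err = max_error(line, new_x, new_y)
interp_new_err = max_error(interp, new_x, new_y)

print(f"calibration error: line {line_calib_err:.3g}, interpolant {interp_calib_err:.3g}")
print(f"new-case error:    line {line_new_err:.3g}, interpolant {interp_new_err:.3g}")
```

The interpolant reproduces the calibration data exactly, noise included, and for that reason it behaves wildly as soon as it is evaluated away from those points. The straight line fits the calibration data less closely but carries over to the new points.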
The cover image in this post is from a benchmark proposed by the Architectural Institute of Japan (AIJ) for validation of CFD codes. Read more here.
CFD simulations typically provide spatial data at a much higher resolution than physical experiments. Often only average quantities sampled at certain spatial points can be measured in an experiment. How can you trust the parts of the simulation results that you cannot directly validate against physical measurements? Here you need to be able to trust the mathematical model and the numerical method upon which your CFD tool is built.
An overfitted model cannot be trusted other than in the data points to which it is calibrated. On the other hand, if you have a model based on established principles of physics, a well-posed mathematical formulation and a stable and accurate numerical method, you can put much more trust in your CFD simulation.
RANS simulates a statistical average of turbulent flow using a parameterisation of the unresolved turbulent scales, and these parameters may lead to overfitting of RANS models. LES relies on a filter width together with subgrid models and is less vulnerable to overfitting, but it also resolves chaotic turbulent scales, so each simulation represents a sample trajectory (on the filter scale), not a statistical average. DNS resolves all turbulent scales, so each simulation represents a sample trajectory on the smallest physical scale.
In physical experiments the statistical nature of turbulence is handled by running the experiment over a long period of time, relying on the assumption of ergodicity, under which statistical averages can be computed as time averages. Often this approach is too expensive in a CFD simulation, where simulation costs scale linearly with the simulation time. Therefore, we need to keep in mind the nature of the different approaches to turbulence in CFD simulations. Are we approximating a trajectory or a statistical average?
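The difference between a trajectory and a statistical average can be sketched with a simple random signal. Below, a first-order autoregressive process stands in for a fluctuating turbulent quantity; the process and all its parameters are assumptions for illustration, not a turbulence model. The statistical mean is estimated in two ways: as a time average over one long trajectory, and as an ensemble average over many short, independent trajectories.

```python
import random
import statistics


def trajectory(n_steps, rng, mean=1.0, phi=0.9, sigma=0.5):
    """One sample trajectory of a stationary AR(1) process.

    Each value is correlated with the previous one (factor phi) and
    fluctuates around `mean`; a toy stand-in for a turbulent signal.
    """
    x = mean
    values = []
    for _ in range(n_steps):
        x = mean + phi * (x - mean) + sigma * rng.gauss(0.0, 1.0)
        values.append(x)
    return values


rng = random.Random(42)

# Time average: a single long run, as in a long physical experiment.
time_avg = statistics.mean(trajectory(100_000, rng))

# Ensemble average: many independent short runs, sampled at their final step.
ensemble_avg = statistics.mean(trajectory(100, rng)[-1] for _ in range(2_000))

# A single instantaneous value from one trajectory, for comparison.
single_sample = trajectory(100, rng)[-1]

print(f"time average:     {time_avg:.3f}")
print(f"ensemble average: {ensemble_avg:.3f}")
print(f"single sample:    {single_sample:.3f}")
```

For an ergodic process like this one, both averages approach the same statistical mean, while a single instantaneous value from one trajectory can be far from it. Either way, the cost is the total number of time steps computed, which is the budget a CFD simulation has to spend to turn trajectories into statistics.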
I will end by giving the same advice I shared in my original article. Acknowledging all the complexities of the field should not discourage anyone from getting into it. We still need to make progress at the frontiers of mathematics, computer science and engineering. And we still need to find new ways to share that technology with non-experts. That has been my mission as a researcher, and it is exciting to be part of the progress toward better and more democratic CFD solutions at Ingrid Cloud.