A truncated Gaussian filter for data assimilation with inequality constraints: Application to the hydrostatic stability condition in ocean models
Introduction
Linear estimation methods (like the Kalman filter) are either defined as the linear estimator minimizing the variance of the estimation error or based on the stronger assumption that the error probability distributions are Gaussian. This last assumption is indeed a sufficient condition for the error variance minimizing estimator to be linear and equal to the maximum probability estimator. Problems can occur if such linear methods are used to estimate a vector of variables subjected to inequality constraints. In such cases, the Gaussian assumption is not valid, and nothing prevents the best linear estimation from violating the inequality constraints. The information contained in the inequality constraints is not taken into account in the estimation process.
In many data assimilation problems, however, the state variables are subjected to inequality constraints that must be taken into account in the assimilation process because: (i) they can be an important source of information about the system, (ii) they are often a prerequisite for a valid model initial condition. For instance, in ocean systems, gravitational stability means that the Brunt-Väisälä frequency (where is the potential density referenced to the local in situ pressure) remains positive. This condition often contains valuable information that can improve the accuracy of the estimation. The reason is that in the ocean there is most often a top layer (the mixed layer) that is marginally stable, so that the ocean state is usually close to a number of hydrostatic stability inequality constraints. Moreover, in isopycnic coordinate ocean models (representing the ocean state as a stack of layers of prescribed density, e.g. Bleck, 2002), the hydrostatic stability condition is expressed in terms of a constraint of non-negativity of the layer thicknesses. Estimating an ocean state with non-negative layer thicknesses is obviously a prerequisite for initializing a model forecast. Ocean models are also subjected to other kinds of inequality constraints: non-negativity of tracer concentration, sea-water temperature above freezing point,… For these reasons, several solutions have been proposed in the literature for including inequality constraints in existing linear data assimilation schemes.
First, inequality constraints were introduced in variational data assimilation methods by adding non-quadratic terms (i.e. with a nonlinear gradient) in the cost function to be minimized (e.g. Fujii et al., 2005). However, this amounts to changing the nature of the problem, replacing the strong constraints by weak constraints. Moreover, this solution requires using a variational algorithm to determine the minimum of a non-quadratic cost function.
A second way of dealing with inequality constraints is to make a nonlinear change of variables (anamorphosis) that transforms the constrained variables to unconstrained variables (Bertino et al., 2002). For instance, for a non-negative variable (like a concentration or a layer thickness), one possible solution is to use a logarithmic transformation (e.g. ) and assume a Gaussian distribution in the logarithmic space. The problem with this method is that it is not easy to find the change of variables that leads to a reasonable assumption. If we assume, for example, that is distributed like , the mean corresponds to a unitary distance with respect to the constraint , the left tail of the Gaussian corresponds to probability tending to zero close to the constraint (typically or ), and the right tail of the Gaussian corresponds to probability tending to zero far from the constraint (typically or ). The resulting assumption for is therefore zero probability close to the constraint , and significant probability for very large values . We will see that such features are not very appropriate for the application considered in this paper.
A third solution proposed in the literature for including inequality constraints in linear schemes is to apply the linear scheme as it is and to introduce an adjustment operator to restore the constraints whenever needed. In their assimilation system, Brankart et al. (2003) introduced an adjustment operator (see Section 5) to restore non-negative layer thicknesses in an isopycnic ocean model (MICOM) after the observational update. More recently, in a hybrid coordinate ocean model (HYCOM), Thacker (2007) proposed using the correlations contained in the background error covariance matrix to move the ocean state from the linear solution to an adjusted solution with non-negative layer thicknesses. His solution is similar to that proposed by Simon and Simon (2005) who solved a quadratic programming problem to compute the adjusted solution (see Section 2.6.1). However, the difficulty with these adjustment operations is that they are based on heuristic considerations, and are only used to compensate for problems resulting from the linearity assumption.
Apart from these transformed linear schemes, there is also the possibility of using more general nonlinear estimation methods like the particle filters (e.g. Van Leeuwen, 2003, Losa et al., 2003, Vossepoel and Van Leeuwen, 2007). Since there is no hypothesis about the shape of prior probability distributions (approximated by an ensemble of model states), the particle filter can potentially deal with any kind of problem and, among other things, solves the problem of dealing with inequality constraints. However, describing a general distribution with sufficient accuracy requires using large ensembles, which involves numerical costs that can become prohibitive for practical ocean data assimilation problems.
In this paper, we choose to develop an optimal nonlinear scheme like the particle filter, with the difference that one prior assumption is made about the shape of the probability distributions: we assume that they are truncated Gaussian distributions (i.e. basically Gaussian distributions truncated by the inequality constraints). Consequently, the resulting estimation will be nonlinear in the input data and the maximum probability estimator will become different from the error variance minimizing estimator. A side effect of these developments is that we will be able to interpret the solution proposed by Simon and Simon, 2005, Thacker, 2007 as an approximation of the optimal scheme (in the maximum probability estimator version). In this way, this study can also be seen as an attempt to provide a more precise theoretical basis for the adjustment operations proposed in these studies. These theoretical aspects are presented in Section 2 of this paper.
With regard to applications of the method, we will seek to determine whether it is able to deal adequately with the condition of hydrostatic stability in ocean analyses, using twin experiments (i) with a one-dimensional z-coordinate model of the ocean mixed layer (Sections 3 Application to a z-coordinate mixed layer model, 4 Observational update of a truncated Gaussian pdf: sensitivity studies) and (ii) with a three-dimensional hybrid coordinate model of the Bay of Biscay (Section 5). With the one-dimensional model, we will be able to describe in detail the parameterization of the probability distributions (Section 3) and the observational update of truncated Gaussian distributions (Section 4), while, with the three-dimensional model, only synthetic results will be presented (Section 5).
Section snippets
Methodology
The purpose of this section is to show that an optimal filter dealing with inequality constraints can be formulated, under the assumption that the error probability distributions are truncated Gaussian. For that purpose, we first give the definition and main properties of truncated Gaussian probability distributions (Section 2.1). Then we present the truncated Gaussian filter (TG filter), with the basic hypothesis sustaining it (Section 2.2). In the two following (Sections 2.3 Sampling
Description of the ocean mixed layer model
In order to illustrate the method presented in Section 2, we will apply it to a one dimension (vertical) mixed layer model developed and validated in Lacroix and Gregoire, 2002, Magri et al., 2005. The model is based on primitive equations, that are reduced to their vertical dimension, assuming horizontal homogeneity. The model state variables are temperature (T), salinity (S), zonal velocity (U), meridional velocity (V) and turbulent kinetic energy. Potential density is diagnosed from
Observational update of a truncated Gaussian pdf: sensitivity studies
The objective of this section is to apply the observational update presented in Section 2 to the case study described in Section 3. In Section 4.1, a first example is presented for one particular case study using a specific observation system and a given size of the error subspace, with the aim of describing in detail how the algorithm actually works. Section 4.2 compares several experiments with a view to analysing the effect of different types of observation system on the truncated Gaussian
Application to a hybrid coordinate model of the Bay of Biscay
After encouraging results obtained with the one-dimensional mixed layer model, the natural step forward is to test the TG filter with a real size system. In this section, it is applied to the problem described in Broquet et al. (2007). The model is a three-dimensional ocean circulation model of the Bay of Biscay at a 1/15° horizontal resolution ( grid points, including 33,815 wet points). It is based on the hybrid coordinate ocean model HYCOM (Bleck, 2002, Chassignet et al., 2003) and
Conclusion
In this paper, we have shown that it is possible to build an optimal nonlinear filter based on the assumption that error probability distributions are truncated Gaussian distributions. The justification for this assumption is that the resulting TG filter is numerically much more efficient than more general methods that do not make any assumption about the shape of the probability distributions (such as particle filters). It is especially more efficient because the observational update can be
Acknowledgements
This work was funded by the Centre National de la Recherche Scientifique (CNRS) and the U.S. Office of Naval Research (under Grant No. N00014-05-1-0731) as part of the NOPP HYCOM project. Additional support from the Service Hydrographique et Oceanographique de la Marine (SHOM) is gratefully acknowledged. We also wish to thank the anonymous reviewers for their useful comments and suggestions. The calculations were conducted with the support of IDRIS/CNRS.
References (28)
An oceanic general circulation model framed in hybrid isopycnic-cartesian coordinates
Ocean Modelling
(2002)- et al.
A survey of numerical methods for nonlinear filtering problems
Physica D
(2007) - et al.
Revisited ecosystem model (MODECOGeL) of the Ligurian Sea: seasonal and interannual variability due to atmospheric forcing
Journal of Marine Systems
(2002) - et al.
Sequential weak constraint parameter estimation in an ecosystem model
Journal of Marine Systems
(2003) - et al.
Data assimilation in a marine ecosystem model of the Ligurian Sea
C.R. Geoscience
(2005) Data assimilation with inequality constraints
Ocean Modelling
(2007)An ensemble adjustment Kalman filter for data assimilation
Monthly Weather Review
(2001)- et al.
Combining geostatistics and Kalman filtering for data assimilation in an estuarine system
Inverse Problems
(2002) - et al.
Implementation of a multivariate data assimilation scheme for isopycnic coordinate ocean models: application to a 1993–1996 hindcast of the North Atlantic Ocean circulation
Journal of Geophysical Research
(2003) - et al.
The SEEK filter method for data assimilation in oceanography: a synthesis
Ocean Dynamics
(2006)
Estimation of model errors generated by atmospheric forcings for ocean data assimilation: experiments in a regional model of the Bay of Biscay
Ocean Dynamics
North Atlantic simulations with the HYbrid Coordinate Ocean Model (HYCOM): impact of the vertical coordinate choice, reference pressure, and thermobaricity
Journal of Physical Oceanography
An introduction to estimation theory
Journal of the Meteorological Society of Japan
The ensemble Kalman filter: theoretical formulation and practical implementation
Ocean Dynamics
Cited by (35)
Stochastic estimation of biogeochemical parameters of a 3D ocean coupled physical-biogeochemical model: Twin experiments
2011, Journal of Marine SystemsA comparison of sequential assimilation schemes for ocean prediction with the HYbrid Coordinate Ocean Model (HYCOM): Twin experiments with static forecast error covariances
2011, Ocean ModellingCitation Excerpt :At present, all assimilation methods have a post-processing step after the assimilation procedure in which final corrections to the model layers are determined based on the analysis increments and constraints listed on Table B.1. Approaches to enforce these constraints as a part of the assimilation procedure by using inequality constraints and anamorphosis transformations (Thacker, 2007; Lauvernet et al., 2009; Simon and Bertino, 2009) have been proposed, and in future might eliminate these aspects from the post-processing step. Here, we examine the sensitivity of the results to the post-processing choices in Section 5.
Robust deconvolution for ARMAX models with Gaussian uncertainties
2010, Signal ProcessingData assimilation with soft constraints (DASC) through a generalized iterative ensemble smoother
2022, Computational GeosciencesA comparison of nonlinear extensions to the ensemble Kalman filter: Gaussian anamorphosis and two-step ensemble filters
2022, Computational Geosciences