Variational chemical data assimilation with approximate adjoints

Variational chemical data assimilation with approximate adjoints

Computers & Geosciences 40 (2012) 10–18 Contents lists available at ScienceDirect Computers & Geosciences journal homepage: www.elsevier.com/locate/...

3MB Sizes 0 Downloads 33 Views

Computers & Geosciences 40 (2012) 10–18

Contents lists available at ScienceDirect

Computers & Geosciences journal homepage: www.elsevier.com/locate/cageo

Variational chemical data assimilation with approximate adjoints Kumaresh Singh, Adrian Sandu  Computational Science Laboratory, Department of Computer Science, Virginia Polytechnic Institute, Blacksburg, VA 24060, USA

a r t i c l e i n f o

a b s t r a c t

Article history: Received 3 February 2011 Received in revised form 1 July 2011 Accepted 7 July 2011 Available online 26 July 2011

Data assimilation obtains improved estimates of the state of a physical system by combining imperfect model results with sparse and noisy observations of reality. In the four-dimensional variational (4D-Var) framework, data assimilation is formulated as an optimization problem, which is solved using gradientbased optimization methods. The 4D-Var gradient is obtained by forcing the adjoint model with observation increments. The construction of the adjoint model requires considerable development effort. Moreover, the computation time associated with the adjoint model is significant (typically, a multiple of the time needed to run the forward model). In this paper we investigate the use of approximate gradients in variational data assimilation. The approximate gradients need to be sufficiently accurate to ensure that the numerical optimization algorithm makes progress toward the maximum likelihood solution. The approximate gradients are obtained through simplified adjoint models; this decreases the adjoint development effort, and reduces the CPU time and the storage requirements associated with the computation of the 4D-Var gradient. The resulting approach, named quasi-4D-Var (Q4D-Var), is illustrated on a global chemical data assimilation problem using satellite observations and the GEOS-Chem chemical transport model. & 2011 Elsevier Ltd. All rights reserved.

Keywords: Data assimilation Simplified adjoint model

1. Introduction Chemical data assimilation is the process of optimally combining observations of reality with imperfect model predictions to produce a better estimate of the chemical state of the atmosphere. Data assimilation has been used to improve initial conditions, emissions, and boundary values. Besides the initial conditions, improvements in boundary values lead to improved air quality forecasts. Considerable experience with data assimilation have been accumulated in the field of numerical weather prediction (Daley, 1991; Kalnay, 2002; Navon, 2009). In this work we focus on chemical data assimilation, i.e., on assimilation of observations of pollutant levels in the atmosphere. Chemical data assimilation poses specific challenges related to the multiphysics nature of the system, the stiffness of chemical kinetic equations, the sparseness of chemical observations, and the uncertainty in the levels of anthropogenic and natural pollutants emitted into the atmosphere. The variational approach formulates data assimilation as a minimization problem of a cost functional that measures the model-observations mismatch. In the three-dimensional variational (3D-Var) approach the observations are treated in succession, while in the four-dimensional variational (4D-Var) approach

 Corresponding author.

E-mail addresses: [email protected] (K. Singh), [email protected] (A. Sandu). 0098-3004/$ - see front matter & 2011 Elsevier Ltd. All rights reserved. doi:10.1016/j.cageo.2011.07.003

all observations available within an assimilation time window are simultaneously used. The Kalman filter approach to data assimilation is rooted in statistical estimation theory and provides the analysis covariance together with the best state estimate. Suboptimal Kalman filters employ different approximations of the covariances to make the computations feasible with large models. The variational approach has been successfully applied to chemical data assimilation (Bei et al., 2008; Carmichael et al., 2008; Elbern and Schmidt, 2001; Khattatov et al., 2000; LeDimet and Talagrand, 1986; Sandu et al., 2005). In particular, 4D-Var assimilation has been used to adjust gas phase chemical tracer initial conditions (Chai et al., 2007; Zhang et al., 2008), pollutant emissions (Chai et al., 2009), and aerosol fields (Hakami et al., 2005; Henze et al., 2009). Suboptimal Kalman filters have been employed successfully for chemical data assimilation (Clark et al., 2006; Lamarque et al., 2002; Menard et al., 2000; Parrington et al., 2009; Pierce et al., 2007; Segers et al., 2005). The use of the ensemble Kalman filter (EnKF) in chemical data assimilation has been studied in Constantinescu et al. (2007a, 2007b). The relative merits of different data assimilation approaches applied to chemical transport modeling have been studied in Constantinescu et al. (2007b), Geer et al. (2006), and Wu et al. (2008). Among the variational techniques, the three-dimensional version (3D-Var) is easier to implement, as it does not involve the construction of model adjoints, and is computationally inexpensive to run. The four-dimensional variational method (4D-Var) is more accurate but requires an adjoint construction, and more computational time

K. Singh, A. Sandu / Computers & Geosciences 40 (2012) 10–18

11

and memory to run. Assimilation techniques that are more accurate than 3D-Var but less expensive than 4D-Var have the potential to provide important practical benefits. In this paper we discuss an assimilation strategy that is based on 4D-Var but uses inexact gradient information. We call this approach quasi-4D-Var (Q4D-Var). For the application studied here, global chemical data assimilation indicates that Q4D-Var is slightly more expensive than 3D-Var and that the quality of the Q4D-Var analyses is comparable to that of strongly constrained 4D-Var analyses. The difference between the model adjoint and the inexact gradients determines the accuracy of the approach. This difference typically increases with time, and therefore the quality of the Q4D-Var analysis is expected to deteriorate with increased assimilation window length. The quality of the Q4D-Var analysis is likely to depend also on the factors that impact the quality of the 4D-Var analysis, such as the density and quality of observations, and the model error. The paper is organized as follows. Section 2 reviews the variational approach and discusses the differences between 3D-Var and 4D-Var. Section 3 introduces the Q4D-Var approach and Section 4 assesses its performance on a global chemical data assimilation study using real satellite data. Conclusions and future work are presented in Section 5.

Based on these three sources of information, data assimilation computes the analysis state xA . The analysis estimation errors eA ¼ xA xt are characterized by the analysis error covariance matrix A A Rnn .

2. Variational data assimilation

2.2. Four-dimensional variational (4D-Var) data assimilation

Variational methods solve the data assimilation problem in an optimal control framework (Courtier and Talagrand, 1987; LeDimet and Talagrand, 1986; Lions, 1971). Specifically, one finds the control variable values (e.g., initial conditions) that minimize the discrepancy between model forecast and observations; the minimization is subject to the governing dynamic equations, which are imposed as strong constraints. In this discussion, for simplicity of presentation, we focus on discrete models where the initial conditions are the control variables. The true state of the system xt A Rn is unknown and needs to be estimated from the available information. The current best estimate xB of the true state is called the a priori, or the background state. The background estimation errors are typically assumed to have a normal probability density, eB ¼ xB xt A N ð0, BÞ, where B A Rnn is the background error covariance matrix. With many nonlinear models this normality assumption is difficult to justify but is nevertheless widely used because of its convenience. A model M encapsulates our knowledge about physical laws that govern the evolution of the system. The model evolves an initial state x0 A Rn at the initial time t0 to future state values xi A Rn at future times ti,

In strongly constrained 4D-Var data assimilation, all observations (2) at all times t1 , . . . ,tN are considered simultaneously. The control parameters are the initial conditions x0 ; they uniquely determine the state of the system at all future times via the model equation (1). The background state is the prior value of the initial conditions xB0 . The 4D-Var analysis is the solution of the following minimization problem, constrained by the model equations

xi ¼ Mt0 -ti x0 :

ð1Þ

Observations yi A Rm of the true state are taken at times ti, i ¼ 1, . . . ,N. The model state and the observations are linked by the relation yi ¼ Hðxi Þeobs i :

ð2Þ

The observation operator H maps the state space onto the observation space. The observation error term eobs accounts for i both the measurement instrument errors as well as representativeness errors (i.e., errors in the accuracy with which the model can reproduce reality). Typically observation errors are assumed to be unbiased and normally distributed,

eobs A N ð0, Ri Þ, i ¼ 1, . . . ,N: i

ð3Þ obs i

Moreover, observation errors at different times (e i aj) are assumed to be independent.

obs j

and e

for

2.1. Three-dimensional variational (3D-Var) data assimilation In 3D-Var data assimilation the observations (2) are accounted for sequentially. The model (1) advances the state to time ti; this forecast provides the background state xBi . An analysis state is obtained using the observations yi as the minimizer b ðxi Þ, xAi ¼ arg min J

ð4Þ

where b ðxi Þ ¼ 1 ðxi xB ÞT B1 ðxi xB Þ þ 1ðHðxi Þyi ÞT R1 ðHðxi Þyi Þ: J i i i i 2 2

ð5Þ

Typically the same covariance matrix is used for all steps: Bi ¼ B0 for all ti. The optimization problem (4) is solved numerically using a gradient-based technique, with T 1 B 0 rxi Jb ðxi Þ ¼ B1 i ðxi xi Þ þ ðH ðxi ÞÞ Ri ðHðxi Þyi Þ:

ð6Þ

The 3D-Var gradient requires the transpose of the linearized observation operator H0 .

xA0 ¼ arg min J ðx0 Þ

subject to ð1Þ,

ð7Þ

where J ðx0 Þ ¼

N 1 1X B ðx0 xB0 ÞT B1 ðHðxi Þyi ÞT R1 ðHðxi Þyi Þ: 0 ðx0 x0 Þ þ 2 2i¼0

ð8Þ The model (1) propagates the optimal initial condition (8) forward in time to provide the analysis at future times, xAi ¼ Mt0 -ti xA0 . The gradient of (8), needed for a numerical solution of the optimization problem (7), reads B rx0 J ðx0 Þ ¼ B1 0 ðx0 x0 Þ

þ

 N  X @xi T

i¼0

@x0

ðH0 ðxi ÞÞT R1 i ðHðxi Þyi Þ:

ð9Þ

The 4D-Var gradient requires not only the linearized observation operator H0 , but also the adjoint of the linearized model (1) (i.e., the transposed derivative of future states with respect to the initial conditions),   @xi T MtTi -t0 ¼ : ð10Þ @x0 The 4D-Var gradient can be obtained effectively by forcing the adjoint model with observation increments, and running it backward in time. The construction of the adjoint model requires considerable development effort. The cost of running the adjoint model is typically a multiple of the cost of running the forward model (1).

12

K. Singh, A. Sandu / Computers & Geosciences 40 (2012) 10–18

3. Quasi-4D-Var (Q4D-Var) data assimilation

3.1. Gradient calculation without model adjoint

To decrease the development effort needed to build a full adjoint model, and to reduce the computational time associated with running it, we consider using approximate gradients in the numerical solution of the optimization problem (7). The approximate gradients are computed using simple approximations of the adjoint model (10). We call the resulting technique quasi-4D-Var. Consider the following quasi-adjoint model, which approximates the model adjoint derivative matrix (10):

The simplest approximation is to completely ignore model dynamics and set the quasi-adjoint to the identity matrix

NtTi -t0  MtTi -t0 ¼



@xi @x0

T :

ð11Þ

Using (11) in (9) leads to the following approximation of the adjoint gradient: B gðx0 Þ ¼ B1 0 ðx0 x0 Þ þ

N X

NtTi -t0 ðH0 ðxi ÞÞT R1 i ðHðxi Þyi Þ:

ð12Þ

i¼0

Assume that the quasi-adjoint approximation (11) is close enough to the exact adjoint, in the sense that JNtTi -t0 MtTi -t0 J rC

8i ¼ 0, . . . ,N, 8x0 : Jx0 xA0 J r R:

ð13Þ

Here C is a uniform bound for all observation times and for all initial conditions located in a ball of radius R around the optimal solution. The approximation error in the quasi-adjoint gradient (12) is

Dgðx0 Þ ¼ gðx0 Þrx0 J ðx0 Þ,

ð14Þ

NtTi -t0 ¼ Inn

8i ¼ 0, . . . ,N:

Assuming that the adjoint model is Lipschitz continuous in time, the quasi-adjoint error is bounded by a function C that does not depend on the model state in a neighborhood of the exact model solution, and depends smoothly on the length of the simulation interval JInn MtTi -t0 J r CðtN t0 Þ

8i ¼ 0, . . . ,N:

ð16Þ

In the worst case the dependency is exponential, CðtN t0 Þ ¼ ea9tN t0 9 but for practical models the increase of the bound with time can be much slower. Consider for example a stable linear system x0 ¼ Ax with A diagonalizable, A ¼ V  diag‘ fl‘ g  V 1 , and realðl‘ Þ o 0. In this case the bound depends only on the eigenvectors V and not on the time interval: JInn MtTi -t0 J ¼ JV T  diag‘ f1el‘ ðti t0 Þ g  V T J r condðVÞ: In summary, we consider that the bound Cðtt0 Þ increases monotonically with the length of the time interval t Zt0 ; this increase can be exponential in the worst case, but we expect that for a large class of models the increase is only moderate. The quasi-gradient reads B gðx0 Þ ¼ B1 0 ðx0 x0 Þ þ

N X

ðH0 ðxi ÞÞT R1 i ðHðxi Þyi Þ

ð17Þ

JðH0 ðxi ÞÞT R1 i ðHðxi Þyi ÞJ:

ð18Þ

i¼0

and has the error bound T

0

JðH ðxi ÞÞ

R1 i ðHðxi Þyi ÞJ:

ð15Þ

i¼0

JDgðx0 ÞJ r CðtN t0 Þ

N X i¼0

The quasi-gradient (17) is a direct extension of the 3D-Var gradient (12): the observations at t1 , . . . ,tN are all considered to have taken place at t0, and are used to correct x0 . The error bound (18) can be reduced by reducing the length of the assimilation window. For N ¼0 we are in the 3D-Var case, and the approximate gradient is exactly (12).

100

100

100

200

200

200

300 400 500

1000

Ozonesonde 3D−VAR Suboptimal KF Q4D No Adj Q4D Adv Adj 4D−VAR Free model run

0 100 200 300 Ozone Concentrations [ppbv]

Pressure [hPa]

Pressure [hPa]

The error bound (18) is reduced when the magnitude of the model–observation residuals JHðxi Þyi J decreases. In particular, if the model is accurate and the noise in the observations is small, the residuals become small close to the analysis trajectory. Thus the approximation (18) becomes more accurate as the iterations converge.

300 400 500

1000 −40 −20 0 20 Relative Difference [%]

Pressure [hPa]

JDgðx0 ÞJ rC

N X

300 400 500

1000

20 30 40 50 60 70 Standard Deviation [%]

Fig. 1. The results shown are for a five-day simulation from 00:00 GMT 1 August 2006 to 00:00 GMT 6 August 2006. Left panel: mean ozone concentrations at ozonesonde locations for 3D-Var, 4D-Var, and Q4D-Var analyses and free model trajectories. Center panel: relative mean errors of predicted ozone concentrations with respect to ozonesonde measurements. Right panel: standard deviation of errors with respect to ozonesonde measurements.

K. Singh, A. Sandu / Computers & Geosciences 40 (2012) 10–18

To reduce the error bound, one can assimilate multiple observations at a time chosen in the middle of the observation interval. Assuming the number of observation times N is even, we have the following quasi-gradient: B gðx0 Þ ¼ B1 0 ðx0 x0 Þ þ

N=2 X

ðH0 ðxi ÞÞT R1 i ðHðxi Þyi Þ:

13

Here t is a discretization parameter that controls the accuracy of the coarse adjoint model, and p is the corresponding order of accuracy. For example, the adjoint model can be run with larger time steps or with coarser grids; the coarse adjoint solution is then interpolated onto the grid of the forward model.

ð19Þ

i ¼ N=2

3.3. Gradient calculation with simplified physics adjoint

3.2. Gradient calculation with a coarse-resolution adjoint model An alternative approach to reduce the gradient CPU time is to run the adjoint model with a coarser resolution. In this case the quasi-adjoint approximation error is JNtTi -t0 MtTi -t0 J rC tp

8i ¼ 0, . . . ,N:

In the case of multiphysics systems, the quasi-adjoint can leave out physical processes that have a smaller impact on the system output. For example, in chemical transport models, the quasi-adjoint can include the transport but leave out the chemistry, particle processes, etc. This seems to be a reasonable approximation when the measurements of a certain chemical species (e.g., ozone columns) are used to adjust the initial concentration of the same species (e.g., ozone). By leaving out

Fig. 2. Differences in global ozone concentrations at 00:00 GMT on 6 August 2006 (end of assimilation window) averaged over the first 10 GEOS-Chem vertical levels. (a) Absolute difference between the 3D-Var analysis and the free model run. (b) Absolute difference between the 4D-Var analysis and the free model run. (c) Absolute difference between the Q4D-Var analysis using no model adjoint and the free model run. (d) Absolute difference between the Q4D-Var analysis using advection adjoint and the free model run. (e) Absolute difference between the Q4D-Var analysis using no model adjoint and 4D-Var analysis. (f) Relative difference between the Q4D-Var analysis using advection adjoint and 4D-Var analysis.

14

K. Singh, A. Sandu / Computers & Geosciences 40 (2012) 10–18

the chemistry, the data assimilation system ignores the correlations between multiple chemical species. We assume that the error in the quasi-adjoint approximation (13) with simplified physics is similar to (16). The approximation error can increase exponentially fast in the worst case, but the increase is expected to be only moderate for a large class of models. Note that the rate of error increase in adjoint is the same as the rate of error increase in the linearized forward model. A practical criterion to test how fast the approximation errors grow in time, and to assess the length of the assimilation window such that these errors remain sufficiently small, is as follows. The current full physics model run starts with x0 and produces the states xi at ti. Run the full physics model with a small perturbation added to b 0 ¼ x0 þ Dx0 , to obtain the perturbed states the initial conditions, x b i ¼ xi þ Dxi at ti. The ratio Cðti t0 Þ  JDxi J=JDx0 J gives an indicax tion of the error growth in the case where the adjoint is approximated by the identity operator (16). Similarly, a model run with simplified physics can be carried out starting from x0 , to

produce the states x~ i at ti. The norms of the differences Cðti t0 Þ  Jx~ i xi J give an indication of how fast the approximation errors grow in time when the full physics is replaced by simplified physics. 3.4. Impact of inexact gradients on the optimization process The use of inexact gradients in optimization has been extensively studied in the literature. In this section we review several results that are relevant for quasi-4D-Var. Kelley and Sachs (1999, Section 2.3.1) considers the behavior of errors in Newton’s method when the Hessian and the gradient are inexact. Let eðkÞ ¼ x0ðkÞ xA0 be the error in the numerical solution at iteration k. Let DHðx0 Þ be the error in the Hessian used by the Newton method. Then ðkÞ ðkÞ Jeðk þ 1Þ J rCðJeðkÞ J2 þ JDHðxðkÞ 0 ÞJ  Je J þJDgðx0 ÞJÞ:

ð20Þ

This implies that the accuracy of the optimal solution cannot be expected to exceed the accuracy of the approximate gradient.

100

100

200

200

200

300 400 Ozonesonde 3D−VAR Suboptimal KF Q4D No Adj Q4D Adv Adj 4D−VAR Free model run

500

1000

0

100

200

300

Ozone Concentrations [ppbv]

300 400 500

1000 −40 −20 0 20 Relative Difference [%]

Pressure [hPa]

100

Pressure [hPa]

Pressure [hPa]

Fig. 3. Errors in global ozone gradient fields for the five-days assimilation window. The differences between the Q4D-Var and the full 4D-Var adjoints at 00:00 GMT on 1 August 2006 (start of assimilation window) are averaged over the first 10 GEOS-Chem vertical levels. (a) Absolute difference between Q4D-Var gradient using no model adjoint and 4D-Var gradient and (b) between Q4D-Var gradient using advection adjoint and 4D-Var gradient.

300 400 500

1000

20

40

60

Standard Deviation [%]

Fig. 4. The results shown are for a two-week simulation from 00:00 GMT 1 August 2006 to 00:00 GMT 15 August 2006. Left-hand panel: mean ozone concentrations at ozonesonde locations for 3D-Var, 4D-Var and quasi-4D-Var analyses and free model trajectories. Center panel: Relative mean errors of predicted ozone concentrations with respect to ozonesonde measurements. Right-hand panel: Standard deviation of absolute values of errors with respect to ozonesonde measurements.

K. Singh, A. Sandu / Computers & Geosciences 40 (2012) 10–18

15

Carter (1991) has studied the convergence of trust region algorithms using inexact gradients. He established that if the gradient approximation error is such that

experiment) setting where the model–observations mismatch is zero for the optimal solution.

Jrx0 J ðx0 Þgðx0 ÞJ r z  Jrx0 J ðx0 ÞJ

4. Numerical experiments

for a constant z then the trust region algorithm converges strongly in the sense that for the iterates xðkÞ 0 , we have

In the numerical experiments we employ the GEOS-Chem v704-10 adjoint code, which is capable of performing both 3D-Var and 4D-Var data assimilation with real data. We assimilate tropospheric emission spectrometer (TES) satellite ozone profile retrievals into the model and compare the analyses against an independent observation dataset provided by direct ozone profile measurements by ozonesondes. The numerical optimization method used in all experiments is the limited memory boundconstrained BFGS (Zhu et al., 1997). This quasi-Newton approach

lim Jrx0 J ðxðkÞ 0 ÞJ ¼ 0:

k-1

Carter indicates that a typical value for the constant is z  0:8. The condition on the approximation error implies that the approximate gradient goes to zero near the exact solution (when the exact gradient goes to zero). This is possible in an idealized (twin

Fig. 5. Differences in global ozone concentrations at 00:00 GMT on 15 August 15 2006 (end of assimilation window) averaged over the first 10 GEOS-Chem vertical levels. (a) Absolute difference between the 3D-Var analysis and the free model run. (b) Absolute difference between the 4D-Var analysis and the free model run. (c) Absolute difference between the Q4D-Var analysis using no model adjoint and the free model run. (d) Absolute difference between the Q4D-Var analysis using advection adjoint and the free model run. (e) Absolute difference between the Q4D-Var analysis using no model adjoint and 4D-Var analysis. (f) Relative difference between the Q4D-Var analysis using advection adjoint and 4D-Var analysis.

16

K. Singh, A. Sandu / Computers & Geosciences 40 (2012) 10–18

has become the ‘‘gold standard’’ in solving large-scale chemical data assimilation problems (Sandu et al., 2005). 4.1. Experimental setting Simulations with GEOS-Chem v7 adjoint are carried out at 41  51 resolution in all our experiments. There are 46  72 latitude–longitude grid boxes at this resolution, and 55 vertical levels. Two assimilation windows are considered. The first covers five days of simulation starting at 00:00 GMT on 1 August 2006 and ending at 00:00 GMT on 6 August 2006. The second is for two simulation weeks starting at 00:00 GMT on 1 August 2006 and ending at 00:00 GMT on 15 August 2006. The initial conditions are generated through a free GEOS-Chem model run. A diagonal background covariance matrix is used, with variances equal to 10% of the initial ozone concentrations. GEOS-Chem tracks transport of ozone as part of the odd oxygen chemical family ([OX]¼ [O3]þ[O]þ[NO2]þ2[NO3]þ[HNO4]þ 3[N2O5]þ[PAN]þ [HNO3]). For the purpose of assessing assimilation results and gradients, it is sufficient to assume that the performance of [OX] tracks that of [O3], which largely drives [OX] concentrations. Data assimilation considers the TES-measured vertical profiles corresponding to the lowest 23 model vertical levels (for up to about 44 hPa) and adjusts only the ozone concentrations within these vertical levels. The observation operator called at model time t (hours) reads in all TES measurements available within the interval t  2 (hours) to t þ2 (hours). All observations within each 4-h interval are treated as instantaneous. The 3D-Var data assimilation adjusts ozone concentrations during each four hour interval. Eight optimization iterations are performed per analysis, since the cost function decreases significantly within the first few iterations. The 4D-Var data assimilation is carried out over a single five-day assimilation window spanning the entire simulation interval. Twelve optimization iterations are performed in order to adjust the ozone initial condition. The cost function and the adjoint gradient accumulate the impact of all 4-h data sets throughout the assimilation window. Note that while in other optimization studies the number of iterations is driven by an accuracy criterion such as sufficient decrease in the norm of the projected gradient, in practical data assimilation studies the maximum number of iterations is usually prescribed, as the computational costs are high, and a result needs to be returned in a fixed computer time budget.

The 4D-Var data assimilation experiments use the full adjoint gradient and two adjoint approximations. The first approximation is the no-adjoint approach (17). The second approximation is based on a simplified physics approach. The adjoint computation includes the advection process, but leaves out chemistry. Recall that 3D-Var data assimilation does not involve any model adjoint calculations; it only needs the adjoint of the observation operator. 4.2. Global ozone data assimilation We now compare the quality of the data assimilation performed with different approximations for the adjoint gradient. The results from a suboptimal Kalman filter are included for completeness. Fig. 1 displays the results for the five-day simulation. The average ozone columns measured by ozonesondes, and predicted by the model (before and after assimilation), are shown in the left-hand panel. The model errors are defined by the differences between the model-predicted ozone columns and the ones measured by ozonesondes. The center and right-hand panels show the mean model errors and their standard deviations, respectively. All assimilation methods improve the ozone predictions considerably. Under 200 hPa, somewhat surprisingly, the best results are given by the Q4D-Var with advection adjoint, followed closely by the full 4D-Var. 3D-Var and the suboptimal Kalman filter perform similarly. Q4D-Var with no adjoint provides an analysis with errors smaller than that of 3D-Var, but larger than that of full 4D-Var. Above 200 hPa the smallest errors are provided by the 3D-Var and the suboptimal Kalman filter. Fig. 2 displays the differences in ozone fields at the end of the assimilation window obtained with different model runs. Panels (a)–(d) show the differences between the analyses given by different assimilation approaches and the free model run. The corrections computed by 4D-Var and Q4D-Var with advection Table 1 Timing results for GEOS-Chem free model run, 3D-Var, Q4D-Var, and 4D-Var data assimilation experiments. All times are relative to the forward model run. Experiment description

Scaled CPU time

Free model run 3D-Var Q4D-Var, no adjoint (per iteration) Q4D-Var, advection adjoint (per iteration) 4D-Var (per iteration)

1.00 1.20 1.00 1.34 5.12

Fig. 6. Errors in global ozone gradient fields for the two-week assimilation window. The differences between the Q4D-Var and the full 4D-Var adjoints at 00:00 GMT on 1 August 2006 (start of assimilation window) are averaged over the first 10 GEOS-Chem vertical levels. (a) Absolute difference between Q4D-Var gradient using no model adjoint and 4D-Var gradient. (b) Absolute difference between Q4D-Var gradient using advection adjoint and 4D-Var gradient.

K. Singh, A. Sandu / Computers & Geosciences 40 (2012) 10–18

400

Forecast

250

200

Suboptimal KF 3D−VAR

150

4D−VAR

Q4D−VAR No Aadj Q4D−VAR Adv Aadj

100

|Analysis − Ozonesonde|

|Analysis − Ozonesonde|

300

200

400 600 Time (mins)

800

1000

Forecast

350 300

Q4D−VAR No Aadj

250 200 Suboptimal KF

150 100

0

17

Q4D−VAR Adv Aadj

3D−VAR

0

500

4D−VAR

1000 1500 Time (mins)

2000

2500

Fig. 7. The analysis error (the mean difference between model-predicted ozone and ozonesonde measurements) versus the computational time required for analysis for (a) the five-week assimilation window, and for (b) the two-week assimilation window.

adjoint are large, and have a similar pattern. The Q4D-Var with no adjoint performs a large correction as well, but the structure is slightly different than that of full 4D-Var. In particular, the ozone correction above Australia appears only with this approach. This is more clearly seen in panels (e) and (f), which show absolute differences between Q4D-Var analyses and the full 4D-Var analysis. Panel (f) reveals that the Q4D-Var with advection adjoint and the full 4D-Var solutions are similar. Panel (e) shows the differences in corrections between the no-adjoint assimilation and the full 4D-Var solution. Fig. 3 shows the differences between the full quasi-adjoint gradients and the full adjoint gradient for the five-day assimilation window. They represent the gradient approximation errors (14) corresponding to the background initial condition. The errors of Q4D-Var with no adjoint are much larger in absolute value, and are clustered around the TES observation points; this indicates that, in the absence of adjoint processes, the observation information is not spread across the computational domain. Fig. 4 displays the assimilation results at the end of the twoweek simulation. The average ozone columns measured by ozonesondes, and predicted by the model (before and after assimilation) are shown in the left-hand panel. The model errors are defined by the differences between the model-predicted ozone columns and the ones measured by ozonesondes. The center and right panels show the mean model errors and their standard deviations, respectively. The two-week results differ from the five-day results presented in Fig. 1. Full 4D-Var continues to provide an excellent solution. The quality of 3D-Var and suboptimal KF solutions is also good, as they have accumulated more information from observations. The Q4D-Var with advection adjoint provides a reasonable analysis, but its error is larger than for the five-day assimilation case, as expected. The version with no adjoint leads to a large mean error. This can be explained by the error estimate (18): for larger assimilation windows the quality of the approximate gradient decreases. The larger error in the gradient leads to a larger error in the optimal solution, as indicated by Eq. (20). Fig. 5 displays the differences in ozone fields at the end of the two-week assimilation window obtained with different model runs. Panels (b), (d), and (f) reveal that the corrections performed by 4D-Var and by Q4D-Var with advection adjoint are similar. The correction of 3D-Var, shown panel (a), has some resemblance with the 4D-Var corrections. The correction performed by Q4D-Var with no adjoint, shown in panel (c), is the smallest of all. This correction differs from 4D-Var, see panel (e), and is especially weak above North America. Fig. 6 shows the differences between the full quasi-adjoint gradients and the full adjoint gradient for the two-week assimilation

window. They represent the gradient approximation errors (14) corresponding to the background initial condition. As expected, these errors are larger than the ones observed for the shorter assimilation window, and displayed in Fig. 3. The errors of Q4DVar with no adjoint are again clustered around the TES observation points, indicating the absence of information spreading in this case. 4.3. Computational efficiency The computational costs of running the free model and each of the assimilation schemes are given in Table 1. The time per iteration for the no-adjoint Q4D-Var is indistinguishable from that of the forward model. The 3D-Var assimilation introduces a relatively small overhead associated with reading the data and performing a sequence of optimization runs. One iteration of the Q4D-Var with advection adjoint is slightly more costly than the forward model, which illustrates the fact that the advection adjoint is inexpensive. Each 4D-Var iteration runs the full adjoint model and is five times more expensive than the forward model. Another resource used by adjoint runs is the storage space associated with the checkpointing files. In our simulation the full adjoint requires about 12 GB of storage per simulation day, the advection adjoint about 480 MB per simulation day, and the noadjoint about 10 MB per simulation day. Fig. 7 shows the relative efficiency of different assimilation methods. The analysis error (the mean difference between modelpredicted ozone and ozonesonde measurements) is plotted against the computational time required for analysis. For both assimilation windows the 4D-Var solution has good quality, but is also the most expensive. The Q4D-Var analysis with advection adjoint is in between the 3D-Var and the 4D-Var solutions, in regard to both computational time and accuracy. The Q4D-Var with no adjoint works well only for the shorter analysis window.

5. Conclusions and future work In this paper we propose quasi-4D-Var (Q4D-Var) data assimilation, a technique that uses approximations of the gradient in 4D-Var data assimilation. The approximations of the gradient should be accurate enough to ensure progress of the numerical optimization algorithm toward the analysis. The approximate adjoint models are simpler to implement, and the adjoint code development effort is decreased. The approximate adjoints have fewer computations and fewer data to checkpoint, mainly intermediate forward concentrations. Consequently, the computational time associated with the computation of the 4D-Var gradient is reduced.

18

K. Singh, A. Sandu / Computers & Geosciences 40 (2012) 10–18

In the context of chemical transport modeling we discuss approximations of the adjoint solution operator by the identity operator and by the continuous advection adjoint. Numerical experiments are carried out with a global chemical data assimilation problem using satellite observations and the GEOS-Chem chemical transport model. These experiments reveal that Q4DVar is only slightly more expensive than 3D-Var. In the same time the accuracy of the resulting analyses is similar to that of 4D-Var for relatively short assimilation windows. For longer assimilation windows the advection based Q4D-Var still performs well, but the one without an adjoint leads to large errors. In summary, this work shows that it is possible, for specific applications, to drastically simplify the adjoint model and still obtain gradients that are useful for optimization in variational data assimilation. Simpler adjoints work better for shorter assimilation windows. Future work is needed to better understand how the approximations of adjoint gradients impact the convergence of the numerical optimization scheme, and how they impact the accuracy of the resulting analysis. As the conclusions are likely to be problem-dependent, one needs to determine the type of adjoint approximations that are sufficient for particular data assimilation problem settings. If this is done, then considerable savings in code development effort and in computational time can be achieved for particular applications.

Acknowledgments K. Singh and A. Sandu were partially supported by NASA through the ROSES-2005 AIST award led by Dr. M. Lee. A. Sandu was partially supported by the National Science Foundation through awards NSF DMS–0915047, NSF CCF–0916493, NSF OCI–0904397. The authors thank Drs. K. Bowman, D. Jones, and M. Parrington for the TES and ozonesonde data used in the numerical experiments, and for numerous discussions on the topic of chemical data assimilation. References Bei, N., de Foy, B., Lei, W., Zavala, M., Molina, L.T., 2008. Using 3D-Var data assimilation system to improve ozone simulations in the Mexico City basin. Atmospheric Chemistry and Physics 8, 7353–7366. doi:10.5194/acp-8-73532008. Carmichael, G.R., Sandu, A., Chai, T., Daescu, D., Constantinescu, E.M., Tang, Y., 2008. Predicting air quality: improvements through advanced methods to integrate models and measurements. Journal of Computational Physics 227 (7), 3540–3571. Carter, R.G., 1991. On the global convergence of trust region algorithms using inexact gradient information. SIAM Journal of Numerical Analysis 28 (1), 251–265. Chai, T., Carmichael, G.R., Tang, Y., Sandu, A., Hardesty, M., Pilewskie, P., Whitlow, S., Browell, E.V., Avery, M.A., Thouret, V., Nedelec, P., Merrill, J.T., Thomson, A.M., 2007. Four dimensional data assimilation experiments with ICARTT (International Consortium for Atmospheric Transport and Transformation) ozone measurements. Journal of Geophysical Research 112, D12S15. doi:10.1029/2006JD007763. Chai, T., Carmichael, G.R., Tang, Y., Sandu, A., Heckel, A., Richter, A., Burrows, J.P., 2009. Regional NOx emission inversion through a four-dimensional variational approach using SCIAMACHY tropospheric NO2 column observations. Atmospheric Environment 43 (32), 5046–5055. doi:10.1016/j.atmosenv.2009. 06.052.

Clark, H.L., Cathala, M.-L., Teysse’dre, H., Cammas, J.-P., Peuch, V.-H., 2006. Crosstropopause fluxes of ozone using assimilation of MOZAIC observations in a global CTM. Tellus, Series A and Series B 59B, 39–49. Constantinescu, E.M., Sandu, A., Chai, T., Carmichael, G.R., 2007a. Investigation of ensemble-based chemical data assimilation in an idealized setting. Atmospheric Environment 41 (1), 18–36. Constantinescu, E.M., Sandu, A., Chai, T., Carmichael, G.R., 2007b. Ensemble-based chemical data assimilation. I: General approach. Quarterly Journal of the Royal Meteorological Society 133 (626), 1229–1243. Courtier, P., Talagrand, O., 1987. Variational assimilation of meteorological observations with the adjoint vorticity equation. 2. Numerical results. Quarterly Journal of the Royal Meteorological Society 113, 1329–1347. Daley, R., 1991. Atmospheric Data Analysis. Cambridge University Press, Cambridge, UK 457 pp.. Elbern, H., Schmidt, H., 2001. Ozone episode analysis by four dimensional variational chemistry data assimilation. Journal of Geophysical Research [Atmospheres] 106, 3569–3590. Geer, A.J., et al., 2006. The ASSET intercomparison of ozone analyses: Method and first results. Atmospheric Chemistry and Physics 6, 5445–5474. Hakami, A., Henze, D.K., Seinfeld, J.H., Chai, T., Tang, Y., Carmichael, G.R., Sandu, A., 2005. Adjoint inverse modeling of black carbon during ACE-Asia. Journal of Geophysical Research 110, D14301. Henze, D.K., Seinfeld, J.H., Shindell, D.T., 2009. Inverse modeling and mapping U.S. air quality influences of inorganic PM2.5 precursor emissions with the adjoint of GEOS-Chem. Atmospheric Chemistry and Physics 9, 5877–5903. Kalnay, E., 2002. Atmospheric Modeling, Data Assimilation and Predictability. Cambridge University Press, Cambridge, UK. Kelley, C.T., Sachs, E.W., 1999. Truncated Newton methods for optimization with inaccurate functions and gradients. SIAM Journal of Optimization 10, 43–55. Khattatov, B.V., Lamarque, J.-F., Lyjak, L.V., Menard, R., Levelt, P., Tie, X., Brasseur, G.P., Gille, J.C., 2000. Assimilation of satellite observations of long-lived chemical species in global chemistry transport models. Journal of Geophysical Research 105 (D23), 29–135. Lamarque, J.-F., Khattatov, B.V., Gille, J.C., 2002. Constraining tropospheric ozone column through data assimilation. Journal of Geophysical Research 107 (D22), 4651. doi:10.1029/2001JD001249. LeDimet, F.-X., Talagrand, O., 1986. Variational algorithms for analysis and assimilation of meteorological observations: theoretical aspects. Tellus 38A, 97–110. Lions, J.L., 1971. Optimal Control of Systems Governed by Partial Differential Equations. Springer-Verlag, Berlin, New York. Menard, R., Cohn, S.E., Chang, L.-P., Lyster, P.M., 2000. Assimilation of stratospheric chemical tracer observations using a Kalman filter: I. Formulation. Monthly Weather Review 128, 2654–2671. Navon, I.M., 2009. Data Assimilation for Numerical Weather Prediction: A Review, in: Data Assimilation for Atmospheric, Oceanic, and Hydrologic Applications, vol. 18, 475 pp., 326 illus. Parrington, M., Jones, D.B.A., Bowman, K.W., Thompson, A.M., Tarasick, D.W., Merrill, J., Oltmans, S.J., Leblanc, T., Witte, J.C., Millet, D.B., 2009. Impact of the assimilation of ozone from the tropospheric emission spectrometer on surface ozone across North America. Geophysical Research Letters 36 (4). Pierce, R.B., et al., 2007. Chemical data assimilation estimates of continental U. S. ozone and nitrogen budgets during the Intercontinental chemical transport Experiment—North America. Journal of Geophysical Research 112, D12S21. doi:10.1029/2006JD007722. Sandu, A., Daescu, D.N., Carmichael, G.R., Chai, T., 2005. Adjoint sensitivity analysis of regional air quality models. Journal of Computational Physics 204, 222–252. Segers, A.J., Eskes, H.J., van der A, R.J., van Oss, R.F., van Velthoven, P.F.J., 2005. Assimilation of GOME ozone profiles and a global chemistry-transport model, using a Kalman filter with anisotropic covariance. Quarterly Journal of the Royal Meteorological Society 131, 477–502. Wu, L., Mallet, V., Bocquet, M., Sportisse, B., 2008. A comparison study of data assimilation algorithms for ozone forecasts. Journal of Geophysical Research 113, D20310. Zhang, L., Constantinescu, E.M., Sandu, A., Tang, Y., Chai, T., Carmichael, G.R., Byun, D., Olaguer, E., 2008. An adjoint sensitivity analysis and 4D-Var data assimilation study of Texas air quality. Atmospheric Environment 42 (23), 5787–5804. Zhu, C., Byrd, R.H., Nocedal, J., 1997. L-BFGS-B: Algorithm 778: L-BFGS-B, FORTRAN routines for large scale bound constrained optimization. ACM Transactions on Mathematical Software 23 (4), 550–560.