Elsevier

Environmental Modelling & Software

Volume 72, October 2015, Pages 402-417
Analysis and classification of data sets for calibration and validation of agro-ecosystem models

https://doi.org/10.1016/j.envsoft.2015.05.009

Highlights

  • Software is presented to classify and label data suitability for modelling.

  • Data requirements for modelling are specific and vary with model purpose.

  • Quantitative classification of data sets facilitates their use for modelling.

  • Test of model and data consistency improves data usability.

  • Guidelines for experimentalists to improve data suitability for modelling.

Abstract

Experimental field data are used at different levels of complexity to calibrate, validate and improve agro-ecosystem models to enhance their reliability for regional impact assessment. A methodological framework and software are presented to evaluate and classify data sets into four classes regarding their suitability for different modelling purposes. Weighting of inputs and variables for testing was set from the perspective of crop modelling. The software allows users to adjust weights according to their specific requirements. Background information is given for the variables with respect to their relevance for modelling and possible uncertainties. Examples are given for data sets of the different classes. The framework helps to assemble high-quality data bases and to select data from data bases according to modellers' requirements. It also gives experimentalists guidelines for experimental design and for deciding on the most effective measurements to improve the usefulness of their data for modelling, statistical analysis and data assimilation.

Introduction

Soil–crop–atmosphere interactions play a central role in the multiple functions of agro-ecosystems and rural landscapes such as food and energy production, carbon sequestration, soil properties, biodiversity or conservation of water resources. Emissions from agriculture are also seen as a threat to the global climate system, which makes agriculture one of the key handles for climate change mitigation. There is an increasing need to better understand these complex systems, and to develop and utilize reliable process-based models for scenario analyses as a basis for policy and management decisions. Agro-ecosystem models are increasingly applied beyond the point and field scales to support decision-making (Van Ittersum et al., 2003, Jones et al., 2003, Brisson et al., 2003, Stockle et al., 2003, Keating et al., 2003), assess the impact of climate change (Holzworth et al., 2015, position paper of thematic issue), and to derive adaptation and mitigation strategies for the sustainable use and management of land and other natural resources (Hammer et al., 2002, White et al., 2011). Integrated Assessment and Modelling as suggested by Parker et al. (2002) requires the integration of dispersed data sources in a consistent and spatially and temporally complete data set to provide necessary model inputs for decision making (Janssen et al., 2009) and to transfer site-based knowledge to regions and continents. With increasing size of the area under investigation, input data tend to become more uncertain relative to the point data of experimental sites, which were the original basis of development for the majority of agro-ecosystem models. Hence, model uncertainty also increases with the area under investigation since data of relevant state variables for testing and evaluation are not commonly available.

Critical to the evaluation, improvement, and use of crop models is the availability of high quality data from field observations. There is a mismatch between the rising demand by users for tested models and research budgets for suitable experimental research and monitoring, which tend to be decreasing (Rötter et al., 2011).

Since field experimental data sets are usually not recorded for modelling purposes, their level of detail, quality of records, variables considered as well as their number of spatial and temporal replicates vary enormously (Nix, 1985, Groot and Verberne, 1991). Therefore, their suitability for modelling is often insufficient for different reasons. White et al. (2013) proposed a standard approach for describing and identifying variables of management, environmental conditions, soils, and crop measurements, all for the purpose of developing, testing, and applying crop simulation models. In general, datasets used for model calibration and validation consist of data describing a) the initial soil conditions, b) the crop-specific management and c) the seasonal weather conditions (Palosuo et al., 2011, Rötter et al., 2012, White et al., 2013). Additionally, data on phenology of the crop, yields and nutrient contents from intermediate harvests, intra-seasonal soil conditions and measurements of fluxes of energy, water and CO2 may be provided.

The international community of agricultural system modellers, e.g., in the Agricultural Model Intercomparison and Improvement Project AgMIP (Rosenzweig et al., 2013) or the European MACSUR (Modelling European Agriculture with Climate Change for Food SecURity) project (Rötter et al., 2013), is currently building harmonized data bases for the purpose of model testing and improvement, including the opportunity to create model-specific interfaces for various models (Porter et al., 2014). To find suitable experimental data for specific modelling applications among the vast number of available data sets, a transparent method of screening and pre-selection is needed, which highlights specific positive and negative features of a data set with respect to the intended application. To evaluate and select data sets, Rosenzweig et al. (2013) proposed different classes of data for so-called “Sentinel Sites”, which represent specific sites with experimental data suitable for different levels of model testing and improvement. However, specific criteria for a transparent classification were not provided. A joint community effort led to the development of a qualitative (Boote et al., 2015) and a quantitative framework (this publication) to evaluate the quality of field experimental data sets for crop modelling according to robust and accepted criteria.

The aim of this paper is to provide a quantitative classification framework by which the consistency and quality of agricultural datasets can be evaluated. Variables under consideration are weighted according to both their importance and their quality, and the weights are justified by literature describing variance and errors of the different state variables and measurement methods. The objectives of such a framework for data evaluation and labelling are (i) to allow data base managers to pre-check the quality of data sets before integrating them into their data base, (ii) to support the creation and use of international publicly available benchmarked data sets for model evaluation, inter-comparison and improvement, (iii) to enable modellers to select appropriate data according to their requirements, and (iv) to give guidance to experimentalists for designing their experiments with respect to aspects that go beyond their primary research question, allowing for a broader use of experimental data for systems analysis and modelling.
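
The general idea of weighting variables by importance and quality and then assigning one of four classes can be illustrated with a minimal sketch. It is not the software presented in this paper: the variable names, weights, quality scores, the normalised weighted mean and the class thresholds below are all hypothetical assumptions chosen only to show the principle, including how a user might adjust weights.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariableRating:
    """One input or test variable of a data set (all values are assumed)."""
    name: str
    weight: float   # relative importance for the modelling purpose
    quality: float  # 0.0 (missing or unusable) to 1.0 (well measured)


def weighted_score(ratings):
    """Return the weight-normalised quality score in [0, 1]."""
    total_weight = sum(r.weight for r in ratings)
    if total_weight <= 0:
        raise ValueError("at least one variable needs a positive weight")
    return sum(r.weight * r.quality for r in ratings) / total_weight


def classify(score, thresholds=(0.85, 0.65, 0.45)):
    """Map a score to one of four classes (1 = most suitable, 4 = least).

    The thresholds are illustrative placeholders, not values from the paper.
    """
    for rank, limit in enumerate(thresholds, start=1):
        if score >= limit:
            return rank
    return len(thresholds) + 1


# Hypothetical data set; a user would adjust the weights to their own needs.
ratings = [
    VariableRating("weather", weight=3.0, quality=1.0),
    VariableRating("soil_initial_conditions", weight=2.0, quality=0.5),
    VariableRating("crop_management", weight=2.0, quality=1.0),
    VariableRating("phenology", weight=1.0, quality=0.0),
]
score = weighted_score(ratings)
print(f"score={score:.2f}, class={classify(score)}")  # score=0.75, class=2
```

In this sketch, raising the weight of a poorly observed variable such as phenology lowers the score, which mirrors how requirements that differ by modelling purpose can move the same data set into a different class.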

Section snippets

Definitions and terminology

Parameterisation means the estimation of fixed model parameter values (e.g., diffusion coefficient of a substance in water) for single processes under controlled conditions.

Calibration means the adjustment of values of model parameters outside the model code (e.g. thermal sums for phenological development in external parameter files) to fit their output to a set of measured state variables or fluxes (Penning de Vries and van Laar, 1982). According to Van Keulen (1976), the main purpose of

Data requirements for model calibration and validation

Application of a model in a new geographic/climatic environment or for a new crop requires new parameterisation and possibly modifications of the model, e.g. by consideration of additional processes; otherwise parameter adjustment to fit observed data becomes a pure tuning or curve fitting exercise (De Wit, 1982). Such extension of a model requires suitable data to identify and parameterise processes and it sometimes requires a re-calibration of parameters of other processes or modules if

Quality and uncertainty of observations

Field data are never free of error (random or bias), which challenges model calibration and validation (van Keulen and Seligman, 1987). This constrains the desired stringency in model evaluation: if random error were the only source, it would, as a rule, make sense to consider model results satisfactory as long as they are within the standard error (SE) of the data, because the SE is the statistical measure of accuracy. However, SE provides no information on bias, so there will always be

Application of the evaluation framework on datasets

To examine the outcome of the evaluation framework we applied it to three datasets used for various modelling studies with two periods at Müncheberg/Germany (1993–1995 and 1996–1998), and one at Lednice/Czech Republic (1992–2006).

The “Müncheberg” data set, described in detail by Mirschel et al. (2007), has been used (at least partially) for model inter-comparisons (Kersebaum et al., 2007, Palosuo et al., 2011). Tables 4 and 5 provide information for the two sites on data availability and the

Discussion

Several attempts have been made to describe data requirements for agro-ecosystem modelling. Minimum data requirements for crop modelling were defined by Nix (1984) and were manifested in the IBSNAT data base (Tsuji et al., 2002). Some data quality requirements were inherently included in the IBSNAT and ICASA protocols, and best methods for growth sampling, minimal land area for growth samples and minimum land area for final yields were specified to ensure data quality (Boote, 1999,

Conclusions

Modeller groups have so far mainly used their specific data sets and formats to develop and evaluate their models (Holzworth et al., 2015). Some groups have access to large data bases to evaluate their models over a wide range of conditions and cropping systems (e.g. Coucheney et al., 2015). The different formats and requirements complicate the joint usage of data by different models and hamper collaboration between modeller groups (Holzworth et al., 2015). The common use of data by multi-model

Acknowledgements

The study was financially supported by the related national European ministries contributing to the MACSUR project (2812ERA_147; _154; _115; _42; _17; _189; _92; _99; _196) under FACCE-JPI (031A103B), and it was further an integrated activity under AgMIP (Agricultural Model Intercomparison and Improvement Project) supported by the US Department of Agriculture and the United Kingdom Department for International Development.

References (107)

  • L.A. Hunt et al.

    Agronomic data: advances in documentation and protocols for exchange and use

    Agric. Syst.

    (2001)
  • S. Janssen et al.

    A database for integrated assessment of European agricultural systems

    Environ. Sci. Pollut.

    (2009)
  • J.W. Jones et al.

    The DSSAT cropping system model

    Eur. J. Agron.

    (2003)
  • B.A. Keating et al.

    An overview of APSIM, a model designed for farming systems simulation

    Eur. J. Agron.

    (2003)
  • K.C. Kersebaum et al.

    Operational use of agro-meteorological data and GIS to derive site specific nitrogen fertilizer recommendations based on the simulation of soil and crop growth processes

    Phys. Chem. Earth

    (2005)
  • K.C. Kersebaum et al.

    Site-specific impacts of climate change on wheat production across regions in Germany using different CO2 response functions

    Eur. J. Agron.

    (2014)
  • B.A. Kimball et al.

    Responses of agricultural crops to free-air CO2 enrichment

    Adv. Agron.

    (2002)
  • H. Mittelbach et al.

    Comparison of four soil moisture sensor types under field conditions in Switzerland

    J. Hydrol.

    (2012)
  • C. Nendel et al.

    The MONICA model: testing predictability for crop growth, soil moisture and nitrogen dynamics

    Ecol. Model.

    (2011)
  • Y. Pachepsky et al.

    Stochastic imaging of soil parameters to assess variability and uncertainty of crop yield estimates

    Geoderma

    (1998)
  • T. Palosuo et al.

    Simulation of winter wheat yield and its variability in different climates of Europe: a comparison of eight crop growth models

    Eur. J. Agron.

    (2011)
  • P. Parker et al.

    Progress in integrated assessment and modelling

    Environ. Model. Softw.

    (2002)
  • C.H. Porter et al.

    Harmonization and translation of crop modeling data to ensure interoperability

    Environ. Model. Softw.

    (2014)
  • H.I. Reuter et al.

    Applications in precision agriculture

  • R.P. Rötter et al.

    Simulation of spring barley yield in different climatic zones of Northern and Central Europe: a comparison of nine crop models

    Field Crops Res.

    (2012)
  • C. Rosenzweig et al.

    The agricultural model intercomparison and improvement project (AgMIP): protocols and pilot studies

    Agric. For. Met.

    (2013)
  • C.O. Stockle et al.

    CropSyst, a cropping systems simulation model

    Eur. J. Agron.

    (2003)
  • M. Trnka et al.

    Simple snow cover model for agrometeorological applications

    Agric. For. Met.

    (2010)
  • M. Trnka et al.

    Quantification of uncertainties introduced by selected methods for daily global solar radiation estimation

    Agric. For. Met.

    (2005)
  • F.N. Tubiello et al.

    Crop response to elevated CO2 and world food supply: a comment on “Food for Thought…” by Long et al., Science 312:1918–1921, 2006

    Eur. J. Agron.

    (2007)
  • H. Webber et al.

    What role can crop models play in supporting climate change adaptation decisions to enhance food security in Sub-Saharan Africa?

    Agric. Syst.

    (2014)
  • J.W. White et al.

    Methodologies for simulating impacts of climate change on crop production

    Field Crops Res.

    (2011)
  • J.W. White et al.

    Integrated description of agricultural field experiments and production: the ICASA version 2.0 data standards

    Comput. Electron. Agric.

    (2013)
  • F. Abbas et al.

    Field calibrations of soil moisture sensors in a forested watershed

    Sensors

    (2011)
  • E.A. Ainsworth et al.

    FACE-ing the facts: inconsistencies and interdependence among field, chamber and modeling studies of elevated [CO2] impacts on crop yield and food supply

    New. Phytol.

    (2008)
  • C.M. Ashraf et al.

    Wheat seed germination under low temperature and moisture stress

    Agron. J.

    (1970)
  • S. Asseng et al.

    Quantifying uncertainties in simulating wheat yields under climate change

    Nat. Clim. Change

    (2013)
  • S. Asseng et al.

    The impact of temperature variability on wheat yields

    Glob. Change Biol.

    (2011)
  • G.A. Baigorria et al.

    Understanding rainfall spatial variability in southeast USA at different timescales

    Int. J. Climatol.

    (2007)
  • P.C. Baveye et al.

    Moving away from the geostatistical lamppost: why, where, and how does the spatial heterogeneity of soils matter?

    Ecol. Model.

    (2014)
  • A. Blum

    Crop responses to drought and the interpretation of adaptation

    Plant Growth Regul.

    (1996)
  • K.J. Boote

    Data requirements for model evaluation and techniques for sampling crop growth and development

  • K.J. Boote et al.

    Sentinel site data for crop model improvement – definition and characterization

  • J.A. Bunce

    Effects of pulses of elevated carbon dioxide concentration on stomatal conductance and photosynthesis in wheat and rice

    Physiol. Plant.

    (2013)
  • W. Cao et al.

    Modelling phasic development in wheat: a conceptual integration of physiological components

    J. Agric. Sci.

    (1997)
  • I.S. Dahiya et al.

    Spatial variability of some nutrient constituents of an Alfisol from loess. I. Classical statistical analysis

    Z. Pflanzenern. Bodenkd.

    (1984)
  • N. Deveau et al.

    Soil pH controls the environmental availability of phosphorus: experimental and mechanistic modelling approaches

    Appl. Geochem.

    (2009)
  • C.T. De Wit

    Simulation of living systems

  • J. Eitzinger et al.

    Sensitivities of crop models to extreme weather conditions during flowering period demonstrated for maize and winter wheat in Austria

    J. Agric. Sci.

    (2013)
  • A.J. Elmore et al.

    Landscape controls on the timing of spring, autumn, and growing season length in mid-Atlantic forests

    Glob. Change Biol.

    (2012)
    Special Issue on Agricultural Systems Modeling & Software.