
Ecosystem modeling: how to deal with time-variant datasets?

Ecosystem modeling usually requires a variety of data to run a complete simulation. These data usually include both time-variant and time-invariant datasets. For example, we usually treat a Digital Elevation Model (DEM) as time-invariant data because surface topography is relatively stable over a given period of time unless extreme events such as earthquakes occur.

However, most other driving datasets are actually time-variant. For example, climate data (temperature, precipitation, etc.) change constantly.

To date, there is no precise definition of whether a dataset should be treated as time-variant or time-invariant, so we always have to make some assumptions to simplify our models.

There are several reasons behind these assumptions, and each of them may be improved upon in future studies.
First, using time-variant data requires more data. For example, an ecosystem model running at a daily time step also requires daily climate data. While it may be relatively easy to retrieve climate data from meteorological stations, it may be much more difficult to retrieve other types of data. For example, a soil survey usually requires a great deal of effort to produce a reliable soil map, so it is practically impossible to provide soil data at a daily time step.

Second, using time-variant data implies more computational demand and a larger storage quota. For large ecosystem models, data preparation and I/O are among the most challenging parts of a successful simulation.
I will use one of my current projects as an example. For a 40-year simulation, I use approximately 6 time-variant datasets, including temperature, precipitation and vapor pressure. That is 40*365*6=87600 matrix files (ignoring leap days). Not to mention the additional data produced during the preparation process, that amount of data places real demands on computational power and on storage/transfer capacity.
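
The arithmetic above can be sketched in a few lines of Python. The file count follows directly from the numbers in the text; the grid size (1000 x 1000 cells) and the 4-byte value size are purely hypothetical assumptions used to show how a raw storage estimate could be derived, and are not the dimensions of my project.

```python
# Rough count of input files and raw storage for a daily-step simulation.
YEARS = 40
DAYS_PER_YEAR = 365   # leap days ignored, as in the text
N_DATASETS = 6

# Assumptions for illustration only (not from the project):
N_ROWS, N_COLS = 1000, 1000   # hypothetical grid size
BYTES_PER_VALUE = 4           # hypothetical 32-bit float per cell


def count_files(years, days_per_year, n_datasets):
    """One matrix file per dataset per simulated day."""
    return years * days_per_year * n_datasets


def raw_storage_gib(n_files, n_rows, n_cols, bytes_per_value):
    """Uncompressed size in GiB if every file holds one full grid."""
    return n_files * n_rows * n_cols * bytes_per_value / 1024**3


n_files = count_files(YEARS, DAYS_PER_YEAR, N_DATASETS)
storage = raw_storage_gib(n_files, N_ROWS, N_COLS, BYTES_PER_VALUE)
print(n_files)            # 87600
print(round(storage, 1))  # size in GiB under the assumed grid
```

Changing the assumed grid size or the number of datasets shows quickly how the storage requirement scales with the simulation setup.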

Third, using time-variant data also makes the model much more complex. The model code itself becomes more complicated because data I/O inside the model increases significantly. In addition, time-variant data usually imply that some of our components must be flexible enough to handle the changes. For example, if the land cover changes within one week because of a wildfire, what happens to the soil properties during this period? Does the surface albedo change linearly or nonlinearly?

To conclude, many challenges remain in handling time-variant datasets. Some of them can be gradually addressed using advanced approaches. For example, high performance computing (HPC) is increasingly used to meet the computational demand and storage requirements. Better data structures and formats such as HDF/NetCDF are also widely used to improve data I/O.

Still, the first step is to define whether each dataset is time-variant or time-invariant.
Below is a brief table illustrating how I classify different data in one of my current projects.
| Data | Time-variant/invariant | Note |
| Temperature | Variant | Daily |
| Precipitation | Variant | Daily |
| Land use and land cover type | Variant | Annually, but it can change when an extreme event occurs |
| Vegetation type | Variant | Annually, but it can change when an extreme event occurs |
| Soil type | Invariant | Changes only when an extreme event occurs |
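
One way to make such a classification usable inside a model is to store it as a small registry and ask it, at each time step, which datasets need to be re-read. The following is a minimal sketch under my own assumptions: the names `DatasetSpec`, `SPECS` and `needs_reload` are hypothetical, "annually" is approximated as every 365 days, and an extreme event is represented by a simple flag.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DatasetSpec:
    name: str
    time_variant: bool
    update_interval_days: Optional[int]  # None means "load once"


# Registry mirroring the table above (365 days approximates "annually").
SPECS = [
    DatasetSpec("temperature", True, 1),
    DatasetSpec("precipitation", True, 1),
    DatasetSpec("land_cover", True, 365),
    DatasetSpec("vegetation_type", True, 365),
    DatasetSpec("soil_type", False, None),
]


def needs_reload(spec, day_index, extreme_event=False):
    """Return True if the dataset should be re-read on this simulation day."""
    if day_index == 0:
        return True   # everything is read once at the start
    if extreme_event:
        return True   # extreme events can change even "invariant" data
    if not spec.time_variant:
        return False
    return day_index % spec.update_interval_days == 0


# Which datasets would be re-read on day 365 without an extreme event?
print([s.name for s in SPECS if needs_reload(s, 365)])
```

Keeping the classification in one place like this means an assumption (for example, treating soil type as invariant) can be revisited later without touching the rest of the model.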
