
The journey of debugging and calibrating a three-dimensional ecosystem model

Currently I am working on the development, debugging, and calibration of the ECO3D model.

Because of the model's complexity, I have put great effort into getting things done properly. Along the way, I have also gained a lot of experience in model development. Here I want to share some of the most useful tips. I will not discuss technical aspects such as the programming language unless it is unavoidable.
  1. You may use OpenMP to speed up the program, but debuggers, such as the one I use, TotalView, are not very friendly when debugging multiple threads. I suggest preparing two versions of the CMakeLists file so you can switch between them for different purposes;
  2. Conditional breakpoints in TotalView can save you time when you are only interested in a certain point/condition;
  3. Ecosystem models, or Earth system models, require a "spin-up" simulation, which usually takes a long time, ranging from hours to days. You don't want to repeat it every time you test your model. To save time as well as HPC resources, we can save the steady-state system state variables to files, so we can restart from them and avoid repeated runs;
  4. If your simulation objects are independent of each other under certain assumptions, for example, the shortwave radiation of one pixel is independent of nearby pixels, you can speed up debugging by shutting down/masking out the other "useless" pixels. This way, you can focus on the one pixel that gives you trouble;
  5. In other scenarios, such as cascade/lateral flow simulation, there is no way to test the model without a complete run because pixels communicate with each other. Even though we have to run all the pixels, that doesn't mean we have to output all of them; we can still output a single pixel for debugging purposes;
  6. Some debugging information is useful for diagnosing the problem. Often we can output (print, cout, etc.) the information directly to the terminal because it is the default I/O. We can also write it to a CSV file, so we can check the time series trend easily using Microsoft Excel. A deep understanding of the trend definitely helps you see what is causing the changes;
  7. Even on the most powerful HPC systems that provide a debug queue, the walltime (less than an hour?) is not enough for debugging, so we have to use a queue with a long walltime. Also, because GUI debugging involves a lot of communication, you should use a remote desktop hosted on the HPC (ThinLinc, etc.) instead of local X windows. Run the debugger as an interactive job for as long as the debugging requires;
  8. Use debug flags to control the workflow efficiently, so you don't have to recompile your program every time you change something;
  9. Before you draw any conclusion about the algorithm, double-check the input files with care. Many times we make assumptions about input data that are not true. Sometimes you can manually modify/create some input for testing purposes;
  10. Be consistent with logical controls: if you change an important control, remember to check all the related ones. If some variables are used across modules, be even more careful;
  11. Keep more than one copy of the workspace with the same file system structure to speed up some checking processes. Both ftp/sftp and GUI tools can be used for data transfer. If you have to transfer a large amount of data, consider Globus.
Let me know if you have any questions.



