
Showing posts from January, 2020

A workflow for distributed parallel data analysis on HPC with checkpoint

A typical task nowadays is to submit a job to a cluster to run some data analysis. But, as far as I know, this approach has some limitations.

Lots of tasks take a long time to run, which means the walltime must be large even with multiple cores; the HPC queue is busy, so a job can wait in the queue for a very long time; and if a job fails, we have to start over. Therefore, I have designed a workflow to resolve these issues. It uses MPI for parallel computing, so we can make use of multiple nodes to speed things up; it provides a checkpoint feature, so it can restart if something goes wrong; and it supports automatic resubmission if the walltime is not enough.
There are several implementations depending on the system. For example, on the SLURM system, a recurring job method can be used.
This design is expected to be able to handle normal operations. However, there is a catch. It makes some assumptions about the workload of individual slave nodes: it assumes that within each walltime, all the slave …
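The checkpoint-and-resubmit idea can be illustrated with a minimal single-process sketch. This is not the workflow's actual implementation: the function name `run_with_checkpoint`, the JSON checkpoint format, and the `walltime_s` and `margin_s` parameters are all assumptions made for illustration, and the MPI distribution of tasks across nodes is left out.

```python
import json
import os
import time


def run_with_checkpoint(tasks, work, checkpoint_path, walltime_s, margin_s=0.0):
    """Process tasks in order, saving progress after each one.

    Hypothetical helper: returns True when every task is done, and False
    when the time budget ran out and the job should be resubmitted.
    """
    start = time.monotonic()

    # Resume from a previous run if a checkpoint file exists.
    results = {}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            results = json.load(f)

    for task in tasks:
        key = str(task)  # JSON object keys are always strings
        if key in results:
            continue  # already finished in an earlier run
        if time.monotonic() - start >= walltime_s - margin_s:
            return False  # out of time: the caller should resubmit
        results[key] = work(task)
        # Write to a temporary file first, then rename it over the old
        # checkpoint, so a crash mid-write cannot corrupt saved progress.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(results, f)
        os.replace(tmp_path, checkpoint_path)

    return True
```

A job script could call this once per submission and resubmit itself whenever it returns False. How the resubmission is triggered depends on the scheduler.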

The life span of a project

Recently I have been working on a project for which I need to prepare some maps. I had to go back and use some code I wrote almost 5 years ago. Then I realized that this has happened to me many times.

I once read a line saying "If you think you are going to do this again in the future, you should write code for it". This is pretty much the major reason that drives me to write lots of code.

Sometimes I write code for fun, to demonstrate or test some ideas (I plan to share some examples in another post). For example, I wrote a program to compare different methods for carbon cycle modeling, using either an explicit method or a matrix method (Link).

But most of the time, I write code so I don't have to do something manually. Just imagine if you had to open 10k Excel files to do some simple math!

(I never wanted to write a programming language, at least for now. Most of my code solves real-world problems.)

And more importantly, I always look ahead.

For example, if I decide to write a program to cal…

A review on dissolved organic carbon modeling

ResearchGate and Mendeley recommend some nice papers to me based on my own publications. Here are some quick reviews of the DOC modeling papers I read today.

The first one:
Simulation of dissolved organic carbon concentrations and fluxes in Chinese monsoon forest ecosystems using a modified TRIPLEX-DOC model
This model is unlikely to be ready for spatial simulation. It also does not consider DOC from litterfall.
There is some confusion about the term "DOC leaching", which should include DOC from both litter and soil.

The second one:
ORCHIDEE MICT-LEAK (r5459), a global model for the production, transport, and transformation of dissolved organic carbon from Arctic permafrost regions–Part 1: Rationale, model description, and simulation protocol
This model seems to be very complex in terms of DOC modeling. It does capture some of the most important processes. But some statements are not convincing due to the complexity of the model. Also, it does not consider the lateral flow process well enough.


Paper discussion: Streamflow in the Columbia River Basin

Streamflow in the Columbia River Basin: Quantifying Changes Over the Period 1951‐2008 and Determining the Drivers of Those Changes

Routing Application for Parallel computatIon of Discharge (RAPID) is a matrix-based river routing model.

My concern is how to map ELM runoff to RAPID because the scales of these two models are very different. ELM is usually at 50km to 100km and RAPID might be around 100m. If the resolutions are different, great uncertainty might be introduced.
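One generic way to bridge such a scale gap is area-weighted aggregation, where each coarse grid cell contributes runoff to a catchment in proportion to their overlap area. The sketch below is only an illustration of that idea, not the coupling used by ELM or RAPID: the function name `area_weighted_runoff`, the input layout, and the units are all assumptions, and the overlap areas would have to come from a separate GIS intersection step.

```python
def area_weighted_runoff(cell_runoff_mm, overlaps):
    """Aggregate gridded runoff depth to catchment runoff volumes.

    Hypothetical helper for illustration.
    cell_runoff_mm: dict of grid cell id -> runoff depth (mm) in one time step
    overlaps: iterable of (cell_id, catchment_id, overlap_area_km2)
    Returns a dict of catchment id -> runoff volume (m^3).
    """
    volume_m3 = {}
    for cell_id, catchment_id, area_km2 in overlaps:
        depth_m = cell_runoff_mm[cell_id] / 1000.0  # mm -> m
        area_m2 = area_km2 * 1.0e6                  # km^2 -> m^2
        # Each overlap contributes depth times the shared area.
        volume_m3[catchment_id] = (
            volume_m3.get(catchment_id, 0.0) + depth_m * area_m2
        )
    return volume_m3
```

This conserves water volume as long as the overlaps cover each grid cell completely, but it assumes runoff is uniform within a cell, which is exactly where the uncertainty from the resolution mismatch enters.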

A review on litter decomposition modeling

Litter decomposition involves a series of processes.

Hydrological control occurs throughout nearly the whole decomposition process. As a result, if there are water-soluble materials in the litter, leaching will take them out.

The water-soluble materials include Dissolved Organic Carbon (DOC).

Some particulate materials (Particulate Organic Carbon (POC), etc.) may also leach out.

The impact of ice content on hydraulic conductivity

In most aquifer tests, the hydraulic conductivity (K) is treated as a function of the material, without explicitly considering the impact of ice content.
When soil or any other material is partially or fully frozen, its actual K decreases significantly. Here I want to explore how to model this impact in a soil hydrology model. Special attention will be paid to the different impacts on vertical and horizontal K.
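As a starting point, one form that appears in the frozen-soil modeling literature is an exponential impedance factor, which scales the unfrozen K down as ice fills the pore space. The sketch below is a minimal illustration under stated assumptions, not a settled formulation: the function name `frozen_conductivity`, the definition of the ice fraction as ice content over porosity, and the default `omega` value are all assumptions, and published models differ on each of these.

```python
def frozen_conductivity(k_unfrozen, ice_content, porosity, omega=6.0):
    """Reduce hydraulic conductivity with an exponential ice impedance factor.

    Hypothetical helper for illustration.
    k_unfrozen: conductivity of the unfrozen material (any consistent unit)
    ice_content: volumetric ice content (m^3 ice per m^3 soil)
    porosity: total pore volume fraction (m^3 per m^3)
    omega: impedance parameter (assumed value; it needs calibration)
    """
    if porosity <= 0.0:
        raise ValueError("porosity must be positive")
    # Fraction of the pore space occupied by ice, clamped to [0, 1].
    ice_fraction = min(max(ice_content / porosity, 0.0), 1.0)
    return k_unfrozen * 10.0 ** (-omega * ice_fraction)
```

Different vertical and horizontal responses could be explored by calling the function with separate `omega` values for each direction, though whether that is physically justified is part of the question to investigate.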