
High performance computing: how to run IDL programs on a cluster

One of the advantages of IDL is that it can be deployed on Linux systems, for example on an HPC (high performance computing) cluster. Fortunately, my research group has this computational resource to advance my research.

So the question is: how do you run IDL programs on an HPC system?
You can, of course, use X Windows and IDLDE, just as you would on a Windows system.
But that is NOT what I am about to cover in this discussion.

Before I start, you should probably refer to this page first.
The core idea is as follows:

#!/bin/bash
: The above line tells Linux to use the shell /bin/bash to execute
: this script. That must be the first line in the script.
: You must have no lines beginning with # before these
: PBS lines other than the /bin/bash line:
#PBS -N 'hello_parallel'
#PBS -o 'qsub.out'
#PBS -e 'qsub.err'
#PBS -W umask=007
#PBS -q low_priority
#PBS -l nodes=1:ppn=4
#PBS -m bea

: Change the current working directory to the directory from which you ran qsub:
cd "$PBS_O_WORKDIR"
: Run IDL and tell it to execute the "main" procedure:
idl -e main
Above is an example of submitting an IDL job to the queue using PBS.
For more information about PBS, go to
http://en.wikipedia.org/wiki/Portable_Batch_System
An explicit example of a PBS script can be found at:
http://www.rcac.purdue.edu/userinfo/resources/rossmann/userguide.cfm#run
Readers are also expected to have some basic Linux skills.
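
To reuse the script above for different IDL procedures, the job file can be generated and then submitted from the shell. The following is only a sketch: the function names, the output file naming, and the QSUB override (used for a dry run) are my own assumptions, not part of PBS or of the original script.

```shell
# Sketch: write a PBS job script for a given IDL procedure, then submit it.
# Function names, file names, and the QSUB override are illustrative only.

write_idl_job() {
    # $1 = job script to create, $2 = job name, $3 = IDL procedure to run
    cat > "$1" <<EOF
#!/bin/bash
#PBS -N $2
#PBS -o $2.out
#PBS -e $2.err
#PBS -l nodes=1:ppn=4

cd "\$PBS_O_WORKDIR"
idl -e $3
EOF
}

submit_idl_job() {
    # $1 = job script; set QSUB to another command (e.g. echo) for a dry run
    local qsub_cmd="${QSUB:-qsub}"
    if ! command -v "$qsub_cmd" >/dev/null 2>&1; then
        echo "error: $qsub_cmd not found; is PBS available on this node?" >&2
        return 1
    fi
    "$qsub_cmd" "$1"
}
```

A typical call would be `write_idl_job job.pbs hello_parallel main && submit_idl_job job.pbs`. On PBS systems, `qstat -u "$USER"` can then be used to check whether the job is queued or running.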

One feature of HPC systems is flexible data storage capacity. When running my jobs, I usually do not want to keep massive data sets alongside my program code. I also benefit a lot from third-party libraries, including the NASA IDL Library. So I want those libraries to be imported automatically when I submit my IDL job.

Here is what I did:
First, add a single line to the bash profile that defines the IDL_STARTUP environment variable.
For other types of shell, the file to which you add this line might be different, but the principle is the same.
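
As a minimal sketch of this step, the helper below appends an IDL_STARTUP definition to a profile file only if it is not already there. The helper name and the default startup path are placeholders of mine; substitute your own profile and path.

```shell
# Sketch: add an IDL_STARTUP definition to a shell profile exactly once.
# The default startup path below is a placeholder, not a required location.
add_idl_startup() {
    # $1 = profile file (e.g. ~/.bash_profile), $2 = startup file (optional)
    local profile="$1"
    local startup="${2:-$HOME/idl/startup.pro}"
    local line="export IDL_STARTUP=\"$startup\""
    # Append only if the exact line is not already present
    grep -qxF "$line" "$profile" 2>/dev/null || printf '%s\n' "$line" >> "$profile"
}
```

For example, `add_idl_startup ~/.bash_profile` followed by a new login makes the variable visible to IDL. In csh or tcsh the equivalent line would use `setenv IDL_STARTUP <path>` in the corresponding startup file instead of `export`.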
Then, go to the path you specified and create your startup file. A simple example of mine is as follows:




The most important line in this startup file is the one that sets the !PATH system variable, which includes the path where I put all my third-party libraries.
One mistake to avoid is mixing up absolute and relative paths. You can't be sure the libraries are really included even if you see 'IDL is awesome' printed when IDL launches! One simple way to check is to start IDLDE and see whether the library routines are highlighted when used.
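
A startup file along these lines could be generated from the shell as sketched below. This is an illustration under my own assumptions, not the original file: the helper name and the library directory are hypothetical, and the helper refuses relative paths to sidestep the absolute/relative pitfall described above. EXPAND_PATH and PATH_SEP are standard IDL routines.

```shell
# Sketch: write an IDL startup file that prepends a library tree to !PATH.
write_idl_startup() {
    # $1 = startup file to create, $2 = absolute path to the library directory
    case "$2" in
        /*) ;;
        *) echo "error: library path must be absolute: $2" >&2; return 1 ;;
    esac
    cat > "$1" <<EOF
; IDL startup file: executed as a batch file each time IDL launches
; The leading + asks EXPAND_PATH to include subdirectories holding IDL files
!PATH = EXPAND_PATH('+$2') + PATH_SEP(/SEARCH_PATH) + !PATH
PRINT, 'IDL is awesome'
EOF
}
```

For example, `write_idl_startup "$HOME/idl/startup.pro" "$HOME/idl_lib"` writes the file. On a machine with IDL installed, `idl -e 'print, !PATH'` should then show whether the library directories were actually picked up, which is a stronger check than seeing the greeting message.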


A common error still remains: IDL cannot find a routine even though you think it is right there!
The solution is to load the routine into IDL and compile it. If it compiles successfully, you will most likely be able to call it.
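
Before compiling by hand, it can help to confirm from the shell that a matching .pro file really exists under the library directory. The sketch below assumes, as far as I know from the IDL documentation, that automatic compilation on Linux looks for a lowercase file name, so it also flags files whose name differs only in case. The helper name and the example routine are hypothetical.

```shell
# Sketch: locate the .pro file for a routine and flag case mismatches.
find_idl_routine() {
    # $1 = library directory, $2 = routine name (without .pro)
    local lower
    lower=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    find "$1" -type f -iname "$lower.pro" | while IFS= read -r f; do
        if [ "$(basename "$f")" = "$lower.pro" ]; then
            echo "ok: $f"
        else
            echo "case mismatch: $f"
        fi
    done
}
```

If `find_idl_routine "$HOME/idl_lib" readfits` prints nothing, the routine is not under that directory at all. If the file is found but IDL still cannot call it, an explicit compile such as `idl -e "resolve_routine, 'readfits', /either"` should surface the underlying compile error.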

Some libraries were written years ago. As new versions of IDL are released, some routines may stop working and need updates. That is why you may be unable to call them today even though they still worked yesterday!


After these steps, you will be able to call any routine from IDL, as long as the library is placed under the path.
One more aspect to consider is that IDL may not have permission to access or write data across different directories, even if you can see the results when debugging. The most reliable way is to operate on directories under the corresponding HPC system.
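
One way to catch such permission problems early is to test the output directory from inside the job script before IDL starts. This is a sketch with a hypothetical helper name; it performs a real write because a plain permission test can miss cases such as read-only mounts.

```shell
# Sketch: verify that a directory exists and accepts writes.
check_output_dir() {
    # $1 = directory the IDL job will write to
    local dir="$1"
    local probe="$dir/.idl_write_probe_$$"
    if [ ! -d "$dir" ]; then
        echo "error: $dir does not exist" >&2
        return 1
    fi
    # Attempt a real write, then clean up the probe file
    if ( : > "$probe" ) 2>/dev/null; then
        rm -f "$probe"
    else
        echo "error: cannot write to $dir" >&2
        return 1
    fi
}
```

In a job script this could be placed just before the IDL call, for example `check_output_dir "$PBS_O_WORKDIR/output" || exit 1`, so that the check runs on the compute node, where the mounted file systems may differ from those on the login node.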
Thanks to Michael Galloy and the ENVI/IDL support team for their help.
