The Question:
What method will users employ, when submitting via Globus, to gain access to the executables they require using Modules?
Requirements
User
- Wants to specify module requests via RSL
- Wants to use modules by submitting a script
Site
- The qsub command (qsub < pbs.pm.built.jobscript) must resolve the "module" call, in whatever way suits the site
The rest, they say, is implementation
Definition
User
- RSL submission:
- The user provides the module name via the RSL extension.
- The user uses the following format:
<module>gcc</module>
<module>snark/stable</module>
<module>intel-cc/8.0.123</module>
- Script submission
- Each site must support one or more script submission methods. The module command must resolve in at least one shell the user uses.
Site
- Site Qsub setup
- There must exist a workable example of qsub < myscript in which myscript contains the module command.
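A site might check the shell half of this requirement before involving qsub at all, by asking a login shell whether it can resolve module. The following is a minimal sketch in POSIX sh, not part of any Globus or PBS tooling: the function name check_module_in_login_shell is made up for illustration, and it assumes the shell under test accepts the -l flag.

```shell
#!/bin/sh
# Hypothetical helper: report whether `module` resolves in a login shell.
# $1 is the shell to test (defaults to /bin/sh).
check_module_in_login_shell() {
    shell=${1:-/bin/sh}
    # -l asks the shell to read its login startup files; `command -v`
    # succeeds if `module` is a function, alias, builtin or executable.
    # stdin comes from /dev/null so a startup file cannot wait for input.
    if "$shell" -l -c 'command -v module' >/dev/null 2>&1 </dev/null; then
        echo "module resolves in $shell -l"
    else
        echo "module does NOT resolve in $shell -l"
        return 1
    fi
}
```

Running check_module_in_login_shell /bin/bash on a compute node prints which case applies. It says nothing about qsub itself, which still needs the end-to-end example described above.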
Implementation:
The two possible methods being compared here are
Method 1. Use the Globus Job Description Extensions support
http://www.globus.org/toolkit/docs/4.0/execution/wsgram/admin-index.html#s-wsgram-admin-extensions
Method 2. Attempt to provide the same environment a user gets when logging in via ssh (for example) by using the -l or --login argument to the shell.
There are three sub-methods to this solution:
- a. Modification to pbs.pm
- b. Creation/alteration of the PBS prologue
- c. Just get the users to do it themselves
Implementation notes:
Method 1. Using the Job Description Extensions (JDE) requires installing additional Globus components and writing a Perl module to process the XML in the extension (see the URL above for details).
Update: It seems that you don't have to write the Perl module if you're using simple XML constructs.
The end user would see a job description similar to the following:
<job>
<executable>myprog</executable>
<directory>${GLOBUS_USER_HOME}</directory>
<extensions>
<module>myprog/1.0</module>
</extensions>
</job>
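To make the mapping concrete, the sketch below turns the module elements of such a job description into module load lines. It is shown in shell purely as an illustration. The real handling would live in the job manager, and the function name modules_to_load_lines is hypothetical. It assumes each module element sits on its own line, as in the example above, and it matches plain text rather than parsing XML.

```shell
#!/bin/sh
# Hypothetical helper: read a job description on stdin and print one
# `module load NAME` line per <module>NAME</module> element.
# Assumes one element per line; this is text matching, not an XML parser.
modules_to_load_lines() {
    sed -n 's|.*<module>\([^<]*\)</module>.*|module load \1|p'
}
```

Feeding it the job description above would print a single line, module load myprog/1.0, which is the command the generated job script would need to run before the executable.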
Method 2. This solution can be implemented in several places. The suggested/preferred options are a or b.
In a nutshell, the solution is to append -l or --login at the end of the #! line in the script that gets executed on the compute node.
Using method 2c, the user would submit a script similar to the following:
#!/bin/sh -l
module load myprog/1.0
myprog
Either option a or b allows the -l to be appended without the user having to specify it. A patch against the default pbs.pm that accomplishes this (method 2a):
--- pbs.pm.orig 2006-06-15 16:07:06.000000000 +1000
+++ pbs.pm.patched 2006-06-19 15:37:29.000000000 +1000
@@ -134,7 +134,7 @@
local(*JOB);
open( JOB, '>' . $pbs_job_script_name );
print JOB<<"EOF";
-#! /bin/sh
+#! /bin/sh -l
# PBS batch job script built by Globus job manager
#
#PBS -S /bin/sh
An example of method 2b (an executable file named $PBS_HOME/mom_priv/prologue):
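#!/bin/bash
# PBS prologue: make each job script run under a login shell.
# $1 is the job id (e.g. 123.server); keep only the sequence number.
JOB_NUM=$(echo "$1" | sed -e "s/\..*//")
# The jobs/ path is relative, so this relies on the prologue's working directory.
if FILE=$(ls jobs/${JOB_NUM}.*.SC 2>/dev/null)
then
  if [ -r "$FILE" ]; then
    # First line of the job script minus the leading #!. Named JOB_SHELL so
    # the environment's own SHELL is left alone. A #! line that already has
    # arguments matches neither pattern below and is left unchanged.
    JOB_SHELL=$(head -1 "$FILE" | sed 's/#!\([^ ]*\)/\1/')
    case $JOB_SHELL in
      # *csh must come before *sh, which would otherwise match it too
      *csh)
        perl -pi -e "s|^(#!$JOB_SHELL).*|\1 -l|" "$FILE"
        ;;
      *sh)
        perl -pi -e "s|^(#!$JOB_SHELL)(.*)|\1 -l\2|" "$FILE"
        ;;
    esac
  fi
fi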
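The rewrite the prologue performs can be tried out away from PBS. The sketch below is a simplified stand-in rather than the prologue itself. It works as a filter instead of editing a file in place, and the function name add_login_flag is hypothetical. It appends -l only when the first line names an interpreter ending in sh and carries no arguments yet.

```shell
#!/bin/sh
# Hypothetical filter: copy a job script from stdin to stdout, appending
# -l to the #! line when the interpreter name ends in "sh" and the line
# has no arguments yet. Any other first line passes through unchanged.
add_login_flag() {
    sed '1s|^\(#! *[^ ]*sh\) *$|\1 -l|'
}
```

For example, printf '#!/bin/sh\nmyprog\n' | add_login_flag yields a script whose first line is #!/bin/sh -l, while a script that already starts with #!/bin/sh -l comes back unchanged.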
You could also check that the job begins with Grid_, but due to the naming scheme of Torque that might not be reliable.
Pros and Cons:
Method 1 Globus Job Description Extensions Support
PROS:
- allows a user to specify a binary executable on the remote host for execution without providing a wrapper script
- makes the actual module loading mechanism transparent to the end user
CONS:
- requires additional components that aren't released with Globus (and aren't fully supported either) and that may be troublesome to install (recall the email extensions)
- doesn't give the user the same experience they would get if they log in via ssh (so users may have to remember two ways to get modules loaded)
- requires us to define the XML - we have trouble defining anything
- not yet implemented
Method 2 Appending -l to the shell that executes the job
PROS:
- provides the same environment that a user would get if logging in via ssh (only one thing to remember for users)
- choice of implementations
- already implemented
CONS:
- potential for problems if login/startup scripts require interaction
- requires the login shell to provide the module command to the users (most likely already the case everywhere)
PRO and CON (depends on your view, and the prologue you implement)
- if implemented as a PBS prologue, all PBS jobs automatically get the benefits of the module command
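The interaction problem listed under the CONS for Method 2 can often be avoided by guarding the interactive parts of a startup file. The sketch below shows one common pattern, assuming batch jobs run without a terminal on standard input. The function name session_kind is made up for illustration.

```shell
#!/bin/sh
# Hypothetical guard for a site or user startup file: only run
# interactive steps when standard input is a terminal.
session_kind() {
    if [ -t 0 ]; then
        echo interactive
    else
        echo batch
    fi
}

# In a startup file, anything that prompts would be wrapped like this:
if [ "$(session_kind)" = interactive ]; then
    : # commands that read from the terminal would go here
fi
```

With a guard like this, a login shell started for a batch job skips the prompts and still picks up the rest of the environment, including the module command.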
Comments
- The changes in pbs.pm to support Method 1 are trivial, and have been tested at VPAC
- Methods 2a and 2b are attractive from a user perspective in that the user is not required to use Globus extensions, which might not be implemented at other sites
- There may be a case for implementing both Methods 1 and 2
--
AndrewSharpe - 19 Jun 2006