The Question:

What method will users employ, when submitting via Globus, to gain access to the executables they require using Modules?

Requirements

User

  1. Want to specify module requests via RSL
  2. Want to use modules via submitting a script

Site

  1. Qsub command (qsub < pbs.pm.built.jobscript) must resolve (site based) the "module" call

The rest, they say, is implementation

Definition

User

  1. RSL submission:
    1. The user provides the module name via the RSL extension.
    2. The user uses the following format:
      <module>gcc</module>
      <module>snark/stable</module>
      <module>intel-cc/8.0.123</module>
            

  2. Script submission
    1. Each site must support one (or more) script submissions. The module command must resolve in at least one shell the user uses.

Site

  1. Site Qsub setup
    1. There must exist a workable example of qsub < myscript in which myscript contains the module command.
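
A site could check this requirement with a throwaway job script. The following is only a sketch: the helper name make_module_test_job and the default module gcc are illustrative assumptions, not part of any site's setup. It prints a script that exits with an error when the module command does not resolve in the shell that runs the job.

```shell
# Print a minimal job script that reports whether `module` resolves
# on the compute node. The module name defaults to gcc (an assumption).
make_module_test_job() {
    mod=${1:-gcc}
    cat <<EOF
#!/bin/sh
# Fail early if the module command is not defined in this shell
if ! type module >/dev/null 2>&1; then
    echo "module command not found" >&2
    exit 1
fi
module load $mod
module list
EOF
}
```

The output would be saved as myscript and submitted with qsub < myscript; a non-zero exit status in the job output indicates that the site setup does not yet resolve the module call.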

Implementation:

The two possible methods being compared here are

Method 1. Use the globus Job Description Extensions Support http://www.globus.org/toolkit/docs/4.0/execution/wsgram/admin-index.html#s-wsgram-admin-extensions

Method 2. Attempt to provide the same environment a user gets when logging in via ssh (for example) by using the -l or --login argument to the shell. There are 3 sub-methods to this solution:

  1. Modification to pbs.pm (method 2a)
  2. Creation/alteration of the PBS prologue (method 2b)
  3. Just get the users to do it themselves (method 2c)

Implementation notes:

Method 1. Using the Job Description Extensions (JDE) requires installing additional Globus components and writing a Perl module to process the XML in the extension (see the above URL for details). Update: It seems that you don't have to write the Perl module if you're using simple XML constructs.

The changes the end user sees would be a job description similar to the following

                <job>
                        <executable>myprog</executable>
                        <directory>${GLOBUS_USER_HOME}</directory>
                        <extensions>
                                <module>myprog/1.0</module>
                        </extensions>
                </job>
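
To make the mapping concrete, here is a rough shell sketch of what the site side has to do with such a description: turn each module element into a module load line for the job script. This is not how Globus processes extensions (that happens in Perl, as noted above); the function name modules_from_jobdesc is hypothetical, and the sketch assumes each module element sits on its own line.

```shell
# Read a job description on stdin and print one "module load NAME"
# line for each <module>NAME</module> element found.
# Assumes one element per line; this is not a general XML parser.
modules_from_jobdesc() {
    sed -n 's|^[[:space:]]*<module>\([^<]*\)</module>[[:space:]]*$|module load \1|p'
}
```

Fed the job description above, this would print the single line module load myprog/1.0.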
Method 2. This solution can be implemented in several places; options a and b are the suggested/preferred ones. In a nutshell, the solution is to append -l or --login to the #! line of the script that gets executed on the compute node. Using method 2c, the user would submit a script similar to the following
                #!/bin/sh -l

                module load myprog/1.0
                myprog
Using either of options a or b allows the -l to be appended without the user specifying it. A patch against the default pbs.pm that accomplishes this (method 2a) is
                --- pbs.pm.orig 2006-06-15 16:07:06.000000000 +1000
                +++ pbs.pm.patched      2006-06-19 15:37:29.000000000 +1000
                @@ -134,7 +134,7 @@
                        local(*JOB);
                        open( JOB, '>' . $pbs_job_script_name );
                        print JOB<<"EOF";
                -#! /bin/sh
                +#! /bin/sh -l
                 # PBS batch job script built by Globus job manager
                 #
                 #PBS -S /bin/sh
An example of method 2b (which is an executable file named $PBS_HOME/mom_priv/prologue) is
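                #!/bin/bash

                # $1 is the job id; keep only the part before the first dot
                JOB_NUM=$(echo "$1" | sed -e "s/\..*//")

                if FILE=$(ls jobs/${JOB_NUM}.*.SC 2>/dev/null)
                then
                        if [ -r "$FILE" ]; then
                                # Interpreter named on the #! line of the job script
                                JOB_SHELL=$(head -n 1 "$FILE" | sed 's/#!\([^ ]*\)/\1/')

                                case $JOB_SHELL in
                                        # csh-style shells must be tested first, because *sh
                                        # also matches them; they accept -l only as the sole
                                        # flag, so any existing arguments are dropped
                                        *csh)
                                                perl -pi -e "s|^(#!$JOB_SHELL).*|\1 -l|" "$FILE"
                                        ;;
                                        # sh, ash, bash, ksh, ...: keep existing arguments
                                        *sh)
                                                perl -pi -e "s|^(#!$JOB_SHELL)(.*)|\1 -l\2|" "$FILE"
                                        ;;
                                esac
                        fi
                fi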
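
The #! rewrite done by the prologue can be tried outside PBS on any sample script. The following is a minimal sketch rather than the prologue itself: add_login_flag is a hypothetical helper, it uses sed instead of perl, it assumes mktemp is available, and it skips files whose #! line already carries -l.

```shell
# add_login_flag FILE: rewrite FILE's #! line so that the interpreter
# is started as a login shell. Files without a shell #! line are left alone.
add_login_flag() {
    file=$1
    first=$(head -n 1 "$file")

    # Nothing to do if -l is already present
    case $first in
        *' -l'|*' -l '*) return 0 ;;
    esac

    # Interpreter path from the #! line (empty if there is no #! line)
    interp=$(printf '%s\n' "$first" | sed -n 's|^#! *\([^ ]*\).*|\1|p')

    case $interp in
        *csh)
            # csh/tcsh: -l must be the only flag, so drop other arguments
            new=$(printf '%s\n' "$first" | sed 's|^\(#! *[^ ]*\).*|\1 -l|') ;;
        *sh)
            # sh, bash, ksh, ...: insert -l ahead of any existing arguments
            new=$(printf '%s\n' "$first" | sed 's|^\(#! *[^ ]*\)|\1 -l|') ;;
        *)
            return 0 ;;
    esac

    # Write the new first line plus the untouched remainder back in place
    tmp=$(mktemp) || return 1
    { printf '%s\n' "$new"; tail -n +2 "$file"; } > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

Writing the result back with cat keeps the permissions of the original file, so an executable job script stays executable.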
You could also check that the job name begins with Grid_, but due to the naming scheme of Torque that might not be reliable.

Pros and Cons:

Method 1 Globus Job Description Extensions Support

PROS:

  • allows a user to specify a binary executable on the remote host for execution without providing a wrapper script
  • makes the actual module loading mechanism transparent to the end user

CONS:

  • requires additional components that aren't released with Globus (and aren't fully supported either) and that may be troublesome to install (recall email extensions)
  • doesn't give the user the same experience they would get if they logged in via ssh (thus potentially requires users to remember two ways to get modules loaded)
  • requires us to define the XML - we have trouble defining anything
  • not yet implemented

Method 2 Appending -l to the shell that executes the job

PROS:

  • provides the same environment that a user would get if logging in via ssh (only one thing to remember for users)
  • choice of implementations
  • already implemented

CONS:

  • potential for problems if login/startup scripts require interaction
  • requires the login shell to provide the module command to users (most likely already the case everywhere)
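
One common way to reduce the first risk is for login scripts to guard anything interactive behind a terminal check. The fragment below is a sketch under the assumption that a batch job's standard input is not a terminal; is_interactive_session is a hypothetical helper name, and the guarded body is a placeholder.

```shell
# Succeed only when stdin is attached to a terminal, which is
# assumed not to be the case for a batch job.
is_interactive_session() {
    [ -t 0 ]
}

# Hypothetical fragment for a login script such as ~/.profile
if is_interactive_session; then
    # prompts and terminal setup (stty, tset and similar) would go here
    :
fi
```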

PRO and CON (depends on your view, and the prologue you implement)

  • if implemented as PBS prologue then all PBS jobs can automatically get the benefits of the module command

Comments

  • The changes in pbs.pm to support Method 1 are trivial, and have been tested at VPAC
  • Methods 2a and 2b are attractive from a user perspective in that the user is not required to use Globus extensions, which might not be implemented at other sites
  • There may be a case for implementing both Methods 1 and 2

-- AndrewSharpe - 19 Jun 2006

Topic revision: r40 - 03 Jul 2006 - 06:31:26 - TerryRankine
 