Rappture Integration with Submit

== Overview ==

The submit command can be used to execute simulation jobs generated by Rappture interfaces remotely. A common approach is to create a shell script that can be exec'd or forked from an application wrapper script. This approach has been applied to Tcl, Python, and Perl wrapper scripts. To avoid consuming large quantities of remote resources, it is imperative that the submit command be terminated when the application user directs it (Abort button).

=== Python Wrapper Script ===

Submit can be called from a Python Rappture wrapper script for remote batch job submission. The code to insert into the wrapper script is detailed here.

An initial code segment is required to catch the Abort button interrupt.

{{{
import os
import sys
import stat
import Rappture
import signal
import re

def sig_handler(signalNumber, frame):
    # Forward the interrupt to the command started through Rappture.tools.
    # The first parameter must not be named 'signal', or it would hide the module.
    if Rappture.tools.commandPid > 0:
        os.kill(Rappture.tools.commandPid, signal.SIGTERM)

signal.signal(signal.SIGINT, sig_handler)
signal.signal(signal.SIGHUP, sig_handler)
signal.signal(signal.SIGQUIT, sig_handler)
signal.signal(signal.SIGABRT, sig_handler)
signal.signal(signal.SIGTERM, sig_handler)
}}}

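The forwarding logic can be tried outside a Rappture session. The following is a minimal sketch under stated assumptions, not the Rappture implementation: `make_forwarder`, `install`, and `get_pid` are hypothetical names introduced here, and `get_pid` stands in for reading `Rappture.tools.commandPid`. The `kill` argument is injectable so the handler can be checked without a real child process.

```python
import os
import signal

def make_forwarder(get_pid, kill=os.kill):
    """Return a signal handler that sends SIGTERM to the pid from get_pid().

    get_pid is a hypothetical stand-in for reading Rappture.tools.commandPid.
    kill is injectable so the handler can be tested without a real process.
    """
    def handler(signal_number, frame):
        pid = get_pid()
        # A pid of 0 or less means no command is currently running.
        if pid > 0:
            kill(pid, signal.SIGTERM)
    return handler

def install(handler, names=("SIGINT", "SIGHUP", "SIGQUIT", "SIGABRT", "SIGTERM")):
    """Register handler for every named signal that exists on this platform."""
    installed = []
    for name in names:
        signum = getattr(signal, name, None)
        if signum is not None:
            signal.signal(signum, handler)
            installed.append(name)
    return installed
```
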
A second code segment builds an executable script that can be executed using Rappture.tools.getCommandOutput. The trap statement catches the interrupt thrown when the wrapper script execution is aborted. Putting the submit command in the background allows multiple submit commands to be issued from the script. The wait statement forces the shell script to wait for all submit commands to terminate before exiting.

{{{
submitScriptName = 'submit_app.sh'
submitScript = """#!/bin/sh

trap cleanup HUP INT QUIT ABRT TERM

cleanup()
{
   echo "Abnormal termination by signal"
   kill -s TERM `jobs -p`
   exit 1
}

"""

# The trailing '&' puts submit in the background so the trap can run while it executes
submitScript += "submit -v u2-grid python foo.py -i bar.in &"
submitScript += "\nwait"

submitScriptPath = os.path.join(os.getcwd(),submitScriptName)

fp = open(submitScriptPath,'w')
if fp:
    fp.write(submitScript)
    fp.close()

# Make the script executable: rwx for the owner, r-x for group and others
os.chmod(submitScriptPath,
         stat.S_IRWXU|stat.S_IRGRP|stat.S_IXGRP|stat.S_IROTH|stat.S_IXOTH)
}}}

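When several submit commands are needed, the script text can be assembled from a list. This is a minimal sketch rather than part of the original wrapper: `build_submit_script` and `SCRIPT_HEADER` are hypothetical names, and the second input file (baz.in) is made up for illustration. Each command is placed in the background and a single wait closes the script, as described above.

```python
SCRIPT_HEADER = """#!/bin/sh

trap cleanup HUP INT QUIT ABRT TERM

cleanup()
{
   echo "Abnormal termination by signal"
   kill -s TERM `jobs -p`
   exit 1
}

"""

def build_submit_script(commands):
    """Return shell script text that runs each command in the background.

    commands is a list of complete submit command lines. Each one gets a
    trailing '&', and a final 'wait' blocks until all of them have exited.
    """
    lines = [command.strip() + " &" for command in commands]
    lines.append("wait")
    return SCRIPT_HEADER + "\n".join(lines) + "\n"

# Example: two jobs on the same venue; baz.in is a hypothetical second input.
script = build_submit_script([
    "submit -v u2-grid python foo.py -i bar.in",
    "submit -v u2-grid python foo.py -i baz.in",
])
```
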
In the preceding code, edit the following line to name the file to be executed remotely (e.g. foo.py), any input files it needs (e.g. bar.in), and the grid to run it on (e.g. u2-grid):

{{{
submitScript += "submit -v u2-grid python foo.py -i bar.in &"
}}}

Also, when running this script on vhub, include the full path to any files that are not in your {{{<}}}tool{{{>}}}/bin directory. The path can be built from the 'TOOLDIR' environment variable, which holds the tool directory in /apps.

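One way to build such paths is sketched below. It assumes only that TOOLDIR is set in the environment as described above; `tool_file` and `submit_line` are hypothetical helper names, and the `data` subdirectory in the example is made up for illustration.

```python
import os

def tool_file(*parts):
    """Return the path of a file below the tool directory named by TOOLDIR."""
    tool_dir = os.environ.get("TOOLDIR")
    if not tool_dir:
        raise RuntimeError("TOOLDIR is not set in the environment")
    return os.path.join(tool_dir, *parts)

def submit_line(venue, script_path, input_path):
    """Build a submit command line from a venue and two file paths."""
    return "submit -v %s python %s -i %s" % (venue, script_path, input_path)

# Example (the 'data' subdirectory is hypothetical):
#   submit_line("u2-grid", tool_file("data", "foo.py"), tool_file("data", "bar.in"))
```
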
The standard method for wrapper script execution of commands can now be used. This streams the output from all submit commands contained in submit_app.sh to the GUI display. The same output is retained in the variable stdOutput.

{{{
exitStatus,stdOutput,stdError = Rappture.tools.getCommandOutput(submitScriptPath)
}}}

Each submit command creates files to hold the standard output and standard error of the submitted command. The file names are of the form JOBID.stdout and JOBID.stderr, where JOBID is an 8 digit number. These results can be gathered as follows.

{{{
# Escape the dot so only names ending in .stdout / .stderr match
re_stdout = re.compile(r".*\.stdout$")
re_stderr = re.compile(r".*\.stderr$")

out2 = ""
errFiles = list(filter(re_stderr.search,os.listdir(os.getcwd())))
if errFiles != []:
    for errFile in errFiles:
        errFilePath = os.path.join(os.getcwd(),errFile)
        if os.path.getsize(errFilePath) > 0:
            f = open(errFilePath,'r')
            outFileLines = f.readlines()
            f.close()
            stderror = ''.join(outFileLines)
            out2 += '\n' + stderror
        os.remove(errFilePath)

outFiles = list(filter(re_stdout.search,os.listdir(os.getcwd())))
if outFiles != []:
    for outFile in outFiles:
        outFilePath = os.path.join(os.getcwd(),outFile)
        if os.path.getsize(outFilePath) > 0:
            f = open(outFilePath,'r')
            outFileLines = f.readlines()
            f.close()
            stdoutput = ''.join(outFileLines)
            out2 += '\n' + stdoutput
        os.remove(outFilePath)
}}}

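The two loops above differ only in the file suffix, so they can be folded into one helper. The following is a sketch of that idea using glob in place of the regular expressions; `collect_job_output` is a hypothetical name and is not part of Rappture or submit. Like the code above, it gathers standard error first, skips empty files, and removes each file after reading it.

```python
import glob
import os

def collect_job_output(directory, remove=True):
    """Concatenate non-empty *.stderr, then *.stdout, files found in directory."""
    collected = ""
    for suffix in ("stderr", "stdout"):
        pattern = os.path.join(directory, "*." + suffix)
        # Sort so the result does not depend on directory listing order.
        for path in sorted(glob.glob(pattern)):
            if os.path.getsize(path) > 0:
                with open(path, "r") as handle:
                    collected += "\n" + handle.read()
            if remove:
                os.remove(path)
    return collected
```

In a wrapper script this could be called as `out2 = collect_job_output(os.getcwd())`.
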
The script file should then be removed.

{{{
os.remove(submitScriptPath)
}}}

The output is presented as the job output log.

{{{
lib.put("output.log", out2, append=1)
}}}

All other result processing can proceed as normal.

A complete file containing the preceding code may be downloaded here: [[File(submit.py)]]

=== Notes ===

If the remotely executed file writes to a file (e.g. foo.out) and you want to open that file once the remote run has finished, you must first build the path to the output file.

Instead of this:

{{{
output = open('foo.out', 'r')
}}}

open it by preceding the file name with the path:

{{{
outputName = 'foo.out'
outputPath = os.path.join(os.getcwd(),outputName)

output = open(outputPath, 'r')
}}}

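A remote job that fails may never produce its output file, so it can help to check for the file before opening it. This is a small sketch under that assumption; `read_job_file` is a hypothetical helper name, not part of Rappture.

```python
import os

def read_job_file(name, workdir=None):
    """Return the text of name inside workdir, or None if the file is missing.

    workdir defaults to the current working directory, as in the note above.
    """
    path = os.path.join(workdir or os.getcwd(), name)
    if not os.path.isfile(path):
        return None
    with open(path, "r") as handle:
        return handle.read()
```
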
You can get help with the submit command by using the {{{--help}}} option

{{{
#> submit --help
Usage: submit [options]

Options:
  -v, --venue           Remote job destination
  -i, --inputfile       Input file
  -n NCPUS, --nCpus=NCPUS
                        Number of processors for MPI execution
  -N PPN, --ppn=PPN     Number of processors/node for MPI execution
  -w WALLTIME, --wallTime=WALLTIME
                        Estimated walltime hh:mm:ss or minutes
  -e, --env             Variable=value
  -m, --manager         Multiprocessor job manager
  -M, --metrics         Report resource usage on exit
  -W, --wait            Wait for reduced job load before submission
  -h, --help            Report command usage

Currently available DESTINATIONs are:
   u2-grid

Currently available MANAGERs are:
   ccni-bgl-CO
   ccni-bgl-VN
   ccni-opteron_lammps
   mpi
   parallel
   sbbnl-bgl-CO
   sbbnl-bgl-VN
   sbbnl-bgp-DUAL
   sbbnl-bgp-SMP
   sbbnl-bgp-VN
}}}

For more information, please visit: [https://hubzero.org/documentation/0.9.0/tooldevs/grid.rappture_submit HUBzero Submit Documentation]