Condor-Web Services Plugin for GridSAM
We at the Department of
Computer Science at University College London have been working in
collaboration with the Condor team at the University of Wisconsin-Madison.
Web Service interfaces have
now been incorporated into all Condor daemons. The Schedd and Collector
interfaces provide the core functionality that is available through the command
line tools. As a means of demonstrating the functionality available through the
Web Service interfaces, we developed a plugin for GridSAM, which is a Web
Service based job submission and monitoring tool that interacts with a number
of resource managers including Condor. Our plugin was designed to interact with
Condor through the Web Service interfaces as an alternative to the existing
plugin, which accesses Condor through the command line tools. The new plugin
can submit to multiple Schedds, which allows a simple form of load balancing
and opens up the possibility of resource federation. It can also submit to
Condor-G. The full details of our work on this project are described
in the Condor Birdbath All Hands paper.
Installation
Condor
The Web Services enabled
Condor daemons are available in Condor
version 6.7.5 and onwards. Details of these daemons can be found on the Condor BirdBath site. On
some versions, the Schedd and Collector WSDL interface definitions are not
included in the installation and must be downloaded separately from the Condor BirdBath site. These
must then be copied into a directory in your Condor release directory such as ${RELEASE_DIR}/web.
In all versions of Condor,
the Web Service interfaces must be enabled by adding the following attributes
to the condor_config file:
WEB_ROOT_DIR = $(RELEASE_DIR)/web
ENABLE_SOAP = TRUE
ENABLE_WEB_SERVER = TRUE
GridSAM
The latest version of the
plugin is compatible with the release of GridSAM bundled with OMII_2.0.0, which uses Apache Axis-1.2 RC3. First install the OMII
server stack, which is compatible with Red Hat Enterprise Linux 3.0 ES and SuSE
9.0. Then install the Managed Programme components, which
install GridSAM on the OMII server. The latest builds of GridSAM can be
downloaded from the GridSAM
homepage; however, these may be incompatible with the plugin.
Plugin
Once you have installed the
GridSAM server, copy the plugin JAR into the ${TOMCAT_HOME}/webapps/gridsam/WEB-INF/lib/ directory. Then copy an
appropriately configured version of the jobmanager.xml file into the ${TOMCAT_HOME}/webapps/gridsam/WEB-INF/classes/ directory. The
plugin should now be ready for use.
JSDL
We have attempted to
reproduce the functionality of the standard Condor plugin so that both plugins
can be used interchangeably without having to modify JSDL. The JSDL supported
by the standard plugin is described here. In
addition to the functionality available in the standard plugin, we have
implemented an interpretation of the jsdl-posix:WorkingDirectory
element. The standard plugin creates a temporary working directory for each job,
into which files can be staged. In our experience, the ability to specify a
working directory is particularly useful when submitting from the same host as
the GridSAM server, i.e. when client and server share a common file system and
data staging is not necessary. Users can simply specify a working directory and
all the input files can be specified relative to this directory. If jsdl:Source is not specified in a jsdl:DataStaging element, it is assumed
that the file is available on the local file system and the data staging must
occur solely between the GridSAM server and the Schedd. In any case, jsdl:DataStaging elements must be
present to describe all input files including the executable so that the plugin
is aware of the files that need to be staged on to the remote Schedd.
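As an illustration of the points above, the following is a minimal sketch of a job description that sets a working directory and declares its input files without jsdl:Source. The namespace URIs are those of JSDL 1.0 and are an assumption, since the GridSAM release in use may expect a different draft of the schema. The file names and the /home/alice/run01 directory are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative sketch only. Namespace URIs are those of JSDL 1.0 (assumed);
     all paths and file names are hypothetical. -->
<jsdl:JobDefinition
    xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl"
    xmlns:jsdl-posix="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix">
  <jsdl:JobDescription>
    <jsdl:Application>
      <jsdl-posix:POSIXApplication>
        <!-- File names are given relative to the working directory below -->
        <jsdl-posix:Executable>analyse.sh</jsdl-posix:Executable>
        <jsdl-posix:Argument>input.dat</jsdl-posix:Argument>
        <jsdl-posix:Output>stdout.txt</jsdl-posix:Output>
        <jsdl-posix:WorkingDirectory>/home/alice/run01</jsdl-posix:WorkingDirectory>
      </jsdl-posix:POSIXApplication>
    </jsdl:Application>

    <!-- One DataStaging element per input file, including the executable.
         No jsdl:Source is given, so each file is taken to be on the local
         file system and is staged only from the GridSAM server to the Schedd. -->
    <jsdl:DataStaging>
      <jsdl:FileName>analyse.sh</jsdl:FileName>
      <jsdl:CreationFlag>overwrite</jsdl:CreationFlag>
    </jsdl:DataStaging>
    <jsdl:DataStaging>
      <jsdl:FileName>input.dat</jsdl:FileName>
      <jsdl:CreationFlag>overwrite</jsdl:CreationFlag>
    </jsdl:DataStaging>
  </jsdl:JobDescription>
</jsdl:JobDefinition>
```

A file that lives on another host would add a jsdl:Source child with a jsdl:URI to its jsdl:DataStaging element. How the plugin resolves the executable path against the working directory should be checked against the deployed version.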
Jobmanager Configuration
The plugin can be configured
through the jobmanager.xml
file. An example is shown below to illustrate the various configuration points
available.
<?xml version="1.0" encoding="UTF-8"?>
<module id="jobmanager.ssh" version="1.0.0">

  <!-- dependent modules -->
  <sub-module descriptor="org/icenigrid/gridsam/resource/config/common.xml"/>
  <sub-module descriptor="org/icenigrid/gridsam/resource/config/shell.xml"/>
  <sub-module descriptor="org/icenigrid/gridsam/resource/config/condor-wsbased.xml"/>
  <sub-module descriptor="org/icenigrid/gridsam/resource/config/embedded.xml"/>
  <sub-module descriptor="database.xml"/>

  <!-- override the factory defaults here -->
  <contribution configuration-id="hivemind.ApplicationDefaults">
    <default symbol="condor.AttachmentSize" value="500"/>
    <default symbol="condor.PollingInterval" value="10"/>
  </contribution>

  <!-- specify Schedd details -->
  <contribution configuration-id="condor.ScheddConfig">
    <Schedd hostname="fried.cs.ucl.ac.uk" port="3408"
            globusResource="lake.esc.cam.ac.uk/jobmanager-pbs">
      <Attributes name="x509userproxysubject" type="STRING-ATTR"
                  value="/C=UK/O=eScience/OU=UCL/L=EISD/CN=some body"/>
      <Attributes name="x509userproxy" type="STRING-ATTR" value="/tmp/x509up_u500"/>
      <Attributes name="GlobusRSL" type="STRING-ATTR" value="(job_type=single)"/>
    </Schedd>
    <Schedd hostname="kotturoti.cs.ucl.ac.uk"
            collectorHostname="medoc.geol.ucl.ac.uk">
      <Requirements requirements="(OpSys==&quot;WINNT51&quot;)" forceRequirements="true"/>
      <Attributes name="NTDomain" type="STRING-ATTR" value="CS"/>
    </Schedd>
  </contribution>
</module>
Symbol | Description
condor.AttachmentSize | The SOAP attachment size (in KB) to be used by Condor to transfer files to and from the remote Schedd.
condor.PollingInterval | The interval (in seconds) between polling requests. The plugin relies on regular polling to query the status of GridSAM jobs on remote Schedds.
Element | Multiplicity | Type | Description
Schedd | [0..*] | |
Schedd/Hostname | [1] | String | The hostname of the remote Schedd.
Schedd/Port | [0..1] | Integer | The port on which the Schedd is running. If this is not specified, for example if the Schedd port is dynamically allocated, CollectorHostname must be specified so that the Schedd port can be discovered through the Collector.
Schedd/CollectorHostname | [0..1] | String | The hostname of the Collector daemon of the pool that this Schedd belongs to. This is used in case a Schedd port number is not provided or if the provided port number is invalid. The Collector can be queried to obtain the Schedd port number.
Schedd/GlobusResource | [0..1] | String | A URL to a Globus jobmanager so that this Schedd can act as a Condor-G client.
Schedd/Requirements | [0..1] | |
Schedd/Requirements/Requirements | [1] | String | A Condor requirements string.
Schedd/Requirements/ForceRequirements | [0..1] | Boolean [true|false] | If set to TRUE, Schedd specific requirements and the job requirements from JSDL will be merged with the AND operator, which provides strict requirements enforcement. We envisage a scenario where an administrator may wish to restrict the type of jobs that are allowed to run on a particular pool based on the job requirements. If this attribute is set to FALSE, Schedd specific requirements and the job requirements from JSDL will be merged with the OR operator, which would serve as guidance to submitters who may not be aware of the nature of the resources they are submitting to. Hence, if the job requirements cannot be satisfied by the pool, an attempt will still be made to schedule and execute the job on available resources. Default is FALSE.
Schedd/Attributes | [0..*] | |
Schedd/Attributes/Name | [1] | String | Attribute Name.
Schedd/Attributes/Type | [1] | [INTEGER-ATTR| FLOAT-ATTR| STRING-ATTR| EXPRESSION-ATTR| BOOLEAN-ATTR| UNDEFINED-ATTR| ERROR-ATTR] | Attribute Type.
Schedd/Attributes/Value | [1] | String | Attribute Value.
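To make the Port, CollectorHostname and ForceRequirements rules more concrete, here is a minimal sketch of a condor.ScheddConfig contribution with two Schedd entries. The host names and the job requirement (Memory > 512) are hypothetical. The merged expressions in the comments only illustrate the AND and OR behaviour described above; the exact form the plugin produces is an assumption.

```xml
<contribution configuration-id="condor.ScheddConfig">

  <!-- No port is given, so collectorHostname is supplied and the Schedd port
       is looked up through the Collector. -->
  <Schedd hostname="schedd1.example.org"
          collectorHostname="collector.example.org">
    <!-- forceRequirements="true": merged with the job requirements using AND.
         With a job requirement of (Memory > 512) the effective expression
         would be along the lines of:
           (OpSys == "LINUX") && (Memory > 512) -->
    <Requirements requirements="(OpSys == &quot;LINUX&quot;)"
                  forceRequirements="true"/>
  </Schedd>

  <!-- A fixed port is given, so no Collector lookup is needed unless the
       port turns out to be invalid. -->
  <Schedd hostname="schedd2.example.org" port="3408">
    <!-- forceRequirements="false" (the default): merged using OR, so the
         Schedd requirements act only as guidance:
           (Arch == "INTEL") || (Memory > 512) -->
    <Requirements requirements="(Arch == &quot;INTEL&quot;)"
                  forceRequirements="false"/>
  </Schedd>

</contribution>
```

Quotation marks inside the requirements attribute are written as &quot; so that the file remains well-formed XML.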
Condor-G Submission
The plugin allows users to
submit jobs to Globus via Condor-G. In jobmanager.xml, a Schedd can be
designated as a Condor-G client by setting its globusResource attribute, which configures the job ClassAd with the
appropriate Condor-G attributes. However, a few additional attributes have to be
set manually because they are specific to each user or job:
- x509userproxysubject
- x509userproxy
- GlobusRSL
They can be set through
jobmanager.xml as described above. An example is shown below:
<Attributes name="x509userproxysubject" type="STRING-ATTR"
            value="/C=UK/O=eScience/OU=UCL/L=EISD/CN=some body"/>
<Attributes name="x509userproxy" type="STRING-ATTR" value="/tmp/x509up_u500"/>
<Attributes name="GlobusRSL" type="STRING-ATTR" value="(job_type=single)"/>
Downloads