sge_execd (8) - Linux Manuals
NAME
xxqs_name_sxx_execd, xxqs_name_sxx_loadsensor - xxQS_NAMExx job execution agent and load sensor interface
SYNOPSIS
xxqs_name_sxx_execd [ -help ]
DESCRIPTION
xxqs_name_sxx_execd controls the xxQS_NAMExx queues local to the machine on which xxqs_name_sxx_execd is running, and executes and controls the jobs sent from the qmaster to be run on these queues, via the shepherd or the shepherd_cmd configuration parameter.
OPTIONS
-help
Prints a listing of all options.
LOAD SENSORS
One or more load sensors may be configured for xxqs_name_sxx_execd via the possibilities listed in the global host configuration, the execution-host-specific cluster configuration, the default qloadsensor, or qidle (when USE_QIDLE is set in execd_params). The executable path of the load sensor is invoked by xxqs_name_sxx_execd on a regular basis (governed by the load_report_time configuration parameter) and delivers one or multiple load figures for the execution host (e.g. users currently logged in) or the complete cluster (e.g. free disk space on a network-wide scratch file system). A load sensor may be a script or a binary executable. In either case its handling of the STDIN and STDOUT streams and its control flow must comply with the following rules. Load sensors are restarted if their modification time changes or they are killed. If a sensor reads a configuration file, for instance, it must be killed to pick up modifications to that file, unless the sensor itself re-reads a modified version.
Load sensor interface
The load sensor must be written as an infinite loop, waiting at a certain point for input from STDIN. If the string "quit" is read from STDIN, the load sensor should exit. When an end-of-line is read from STDIN, a load data retrieval cycle should start. The load sensor then performs whatever operation is necessary to compute the desired load figures. At the end of the cycle the load sensor writes the result to STDOUT. The format is as follows:
- A load value report starts with a line containing only either the word "start" or the word "begin".
- Individual load values are separated by newlines.
- Each load value consists of three parts separated by colons (":") and containing no blanks.
- The first part of a load value is either the name of the host for which load is reported or the special name "global".
- The second part is the symbolic name of the load value as defined in the host or global complex list. This must be the full name of the complex, not the shortcut name. If a load value is reported for which no entry in the host or global complex list exists, the reported load value is not used.
- The third part is the measured load value.
- A load value report ends with a line containing only the word "end".
NB. If the runtime of the language in which the sensor is written buffers the output (e.g. Perl), ensure it is flushed on each iteration.
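The rules above can be illustrated with a minimal Python sketch of a load sensor. It is not a sensor shipped with xxQS_NAMExx: the complex name "example_load", its constant value, and the helper names (format_report, measure, cycles, run) are all hypothetical and would have to be replaced by a complex that actually exists in the host or global complex list.

```python
import socket
import sys


def format_report(host, values):
    """Build one load value report as a list of lines.

    `values` maps full complex names (not shortcuts) to measured values.
    """
    lines = ["begin"]
    for name, value in values.items():
        lines.append(f"{host}:{name}:{value}")  # host:name:value, no blanks
    lines.append("end")
    return lines


def measure():
    """Hypothetical measurement; replace with a real load figure."""
    return {"example_load": 1}


def cycles(lines, host):
    """Yield one report per input line, stopping when "quit" is read."""
    for line in lines:
        if line.strip() == "quit":
            return
        yield format_report(host, measure())


def run():
    """Read STDIN until "quit" or end of input, reporting on STDOUT."""
    host = socket.gethostname()
    for report in cycles(iter(sys.stdin.readline, ""), host):
        sys.stdout.write("\n".join(report) + "\n")
        sys.stdout.flush()  # flush on each iteration so output is not buffered
```

To use this as a sensor script, call run() at the end of the file. Passing "global" instead of the host name to format_report would report a cluster-wide value.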
ENVIRONMENT VARIABLES
- xxQS_NAME_Sxx_ROOT
- Specifies the location of the xxQS_NAMExx standard configuration files.
- xxQS_NAME_Sxx_CELL
- If set, specifies the default xxQS_NAMExx cell. To address a xxQS_NAMExx cell, xxqs_name_sxx_execd uses (in order of precedence):
    The name of the cell specified in the environment variable xxQS_NAME_Sxx_CELL, if it is set.
    The name of the default cell, i.e. default.
- xxQS_NAME_Sxx_DEBUG_LEVEL
- If set, specifies that debug information should be written to stderr. In addition, the level of detail in which debug information is generated is defined.
- xxQS_NAME_Sxx_QMASTER_PORT
- If set, specifies the TCP port on which the qmaster is expected to listen for communication requests. Most installations will instead use a services map entry for the service "sge_qmaster" to define that port.
- xxQS_NAME_Sxx_EXECD_PORT
- If set, specifies the TCP port on which xxqs_name_sxx_execd is expected to listen for communication requests. Most installations will instead use a services map entry for the service "sge_execd" to define that port.
- SGE_ND
- If set, don't daemonize the program (for debugging).
- SGE_ENABLE_COREDUMP
- If set, enable core dumps on Linux when the admin_user is not root. Linux normally disables core dumps when the daemon has changed uid or gid. Setting SGE_ENABLE_COREDUMP in xxqs_name_sxx_execd's environment overrides that, enabling core dumps for debugging if they are otherwise allowed. This is typically not a big hazard with xxQS_NAME_Sxx, since most information is exposed in the spool area anyhow. Dumps will appear in the qmaster spool directory, which need not be world-readable. On Solaris, a separate system facility may be used to enable such dumps.
- SGE_EXECD_PIDFILE
- Path name of the file to which the daemon process id is written at startup (default "execd.pid"). Note that this must be writable by the admin user.
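The lookup order described for the cell and port variables can be sketched in Python as follows. This only mirrors the precedence stated above and is not how xxqs_name_sxx_execd itself is implemented. The concrete variable names SGE_CELL, SGE_QMASTER_PORT and SGE_EXECD_PORT assume the xxQS_NAME_Sxx placeholder expands to SGE, the function names are hypothetical, and treating an empty variable as unset is also an assumption.

```python
import os
import socket


def resolve_cell(environ=None, var="SGE_CELL"):
    """Return the cell named in the environment, else the default cell."""
    env = os.environ if environ is None else environ
    # Assumption: an empty value is treated the same as an unset variable.
    return env.get(var) or "default"


def resolve_port(var, service, environ=None, lookup=socket.getservbyname):
    """Return the TCP port from the environment, else from the services map."""
    env = os.environ if environ is None else environ
    value = env.get(var)
    if value:
        return int(value)
    # Fall back to the services map entry, e.g. "sge_execd" or "sge_qmaster".
    return lookup(service, "tcp")
```

For example, resolve_port("SGE_EXECD_PORT", "sge_execd") returns the port from the environment if the variable is set, and otherwise consults the services map, raising OSError if no such entry exists.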
RESTRICTIONS
xxqs_name_sxx_execd is usually started as root on each machine in the xxQS_NAMExx pool. If started by a normal user, a spool directory must be used to which the user has read/write access. In this case only jobs submitted by that same user are handled correctly by the system.
FILES
<xxqs_name_sxx_root>/<cell>/common/configuration
    xxQS_NAMExx global configuration
<xxqs_name_sxx_root>/<cell>/common/local_conf/<host>
    xxQS_NAMExx host specific configuration
<xxqs_name_sxx_root>/<cell>/spool/<host>
    Default execution host spool directory
<xxqs_name_sxx_root>/<cell>/common/act_qmaster
    xxQS_NAMExx master host file
<xxqs_name_sxx_root>/bin/<arch>/qloadsensor
    Default load sensor
<xxqs_name_sxx_root>/bin/<arch>/qidle
    Idle load sensor per USE_QIDLE in execd_params
<xxqs_name_sxx_root>/<cell>/common/sgepasswd
    Password information used on Microsoft Windows hosts
COPYRIGHT
See for a full statement of rights and permissions.