Trigger an event when the primary slurmctld fails.
Trigger an event when the primary slurmctld resumes operation after failure.
Trigger an event when the primary slurmctld resumes control.
Trigger an event when a BlueGene block enters an ERROR state.
Trigger an event when the backup slurmctld fails.
Trigger an event when the backup slurmctld resumes operation after failure.
Trigger an event when the backup slurmctld assumes control.
Clear or delete a previously defined event trigger.
The --id, --jobid or --user
option must be specified to identify the trigger(s) to be cleared.
Only user root or the trigger's creator can delete a trigger.
Trigger an event if the specified node goes into a DOWN state.
Trigger an event if the specified node goes into a DRAINED state.
Trigger an event when the primary slurmctld accounting buffer is full.
Trigger an event if the specified node goes into a FAILING state.
Trigger an event when the specified job completes execution.
Associate flags with the trigger. Multiple flags should be comma separated.
Valid flags include:
Trigger events based upon changes in state of front end nodes rather than
compute nodes. Applies to BlueGene and Cray architectures only, where the
slurmd daemon executes on front end nodes rather than the compute nodes.
Use this option with either the --up or --down option.
Trigger an event when the primary slurmdbd fails.
Trigger an event when the primary slurmdbd resumes operation after failure.
Show registered event triggers.
Options can be used for filtering purposes.
Trigger an event when the primary database fails.
Trigger an event when the primary database resumes operation after failure.
Trigger ID number.
Trigger an event if the specified node remains in an IDLE state
for at least the time period specified by the --offset
option. This can be useful to hibernate a node that remains idle,
thus reducing power consumption.
Job ID of interest.
NOTE: The --jobid option can not be used in conjunction
with the --node option. When the --jobid option is
used in conjunction with the --up or --down option,
all nodes allocated to that job will be considered the nodes used as a
trigger event.
Clusters to issue commands to.
Host name(s) of interest.
By default, all nodes associated with the job (if --jobid
is specified) or on the system are considered for event triggers.
NOTE: The --node option can not be used in conjunction
with the --jobid option. When the --jobid option is
used in conjunction with the --up, --down or
all nodes allocated to that job will be considered the nodes used as a
trigger event. Since this option's argument is optional, for proper
parsing the single-letter option must be followed immediately by
the value, with no space between them. For example, "-ntux"
and not "-n tux".
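As an illustration of the parsing rule above, the following sketch assembles the arguments for a node-down trigger with the host name attached directly to the single-letter option. The helper name node_down_args is hypothetical, and the host name and program path are placeholders borrowed from the examples in this page.

```shell
#!/bin/bash
# Hypothetical helper (not part of strigger): print, one per line, the
# arguments for a trigger that fires when a given node goes DOWN.
# The host name is glued to -n with no space, as the parsing rule requires.
node_down_args() {
    local node=$1 program=$2
    printf '%s\n' --set "-n${node}" --down "--program=${program}"
}

# Show the arguments that would be passed to strigger.
node_down_args tux /usr/sbin/slurm_admin_notify

# To actually register the trigger on a system with Slurm installed:
#   mapfile -t args < <(node_down_args tux /usr/sbin/slurm_admin_notify)
#   strigger "${args[@]}"
```

Printing the arguments first makes it easy to check that "-ntux" is a single word before anything is registered.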
Do not print the header when displaying a list of triggers.
The specified action should follow the event by this time interval.
Specify a negative value if the action should precede the event.
The default value is zero if no --offset option is specified.
The resolution of this time is about 20 seconds, so to execute
a script not less than five minutes prior to a job reaching its
time limit, specify --offset=-320 (5 minutes plus 20 seconds, negative because the action precedes the event).
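The arithmetic can be sketched as follows, using the rule that a negative value places the action before the event and padding the lead time by the approximate 20 second resolution. The helper name offset_before_event is hypothetical and not part of strigger.

```shell
#!/bin/bash
# Hypothetical helper: compute an --offset value (in seconds) that fires
# at least LEAD_MINUTES before an event. The result is negative because
# the action precedes the event.
offset_before_event() {
    local lead_minutes=$1
    local resolution=20   # seconds; approximate resolution stated in the text
    echo $(( -(lead_minutes * 60 + resolution) ))
}

# Five minutes of lead time plus the 20 second padding.
offset_before_event 5

# Possible use on a system with Slurm installed (job ID and path are
# placeholders taken from the examples in this page):
#   strigger --set --jobid=1234 --time \
#            --offset="$(offset_before_event 5)" --program=/home/joe/clean_up
```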
Execute the program at the specified fully qualified pathname
when the event occurs.
You may quote the path and include extra program arguments if desired.
The program will be executed as the user who sets the trigger.
If the program fails to terminate within 5 minutes, it will
be killed along with any spawned processes.
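One way a trigger program might stay within the 5 minute limit is to bound its own work, so that it exits cleanly instead of being killed. The sketch below assumes the coreutils timeout command is available; run_bounded is a hypothetical wrapper name, and the 240 second limit and the placeholder command are illustrative only.

```shell
#!/bin/bash
# Hypothetical wrapper (not part of Slurm): run a command under a time
# limit so a trigger program stops itself before the 5 minute kill.
# Relies on the coreutils `timeout` command, which exits with status 124
# when the limit is reached.
run_bounded() {
    local limit_seconds=$1
    shift
    timeout "$limit_seconds" "$@"
}

# Example: allow a (placeholder) clean-up step up to 240 seconds.
if run_bounded 240 true; then
    echo "clean-up finished in time"
else
    echo "clean-up failed or timed out" >&2
fi
```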
Do not report non-fatal errors.
This can be useful to clear triggers which may have already been purged.
Trigger an event when the system configuration changes.
This is triggered when the slurmctld daemon reads its configuration file or
when a node state changes.
Register an event trigger based upon the supplied options.
NOTE: An event is only triggered once. A new event trigger
must be established for future events of the same type
to be processed.
Triggers can only be set if the command is run by the user
SlurmUser unless SlurmUser is configured as user root.
Trigger an event when the specified job's time limit is reached.
This must be used in conjunction with the --jobid option.
Trigger an event if the specified node is returned to service
from a DOWN state.
Clear or get triggers created by the specified user.
For example, a trigger created by user root for a job created by user
adam could be cleared with the option --user=root.
Specify either a user name or user ID.
Print detailed event logging. This includes time-stamps on data structures,
record counts, etc.
-V, --version
Print version information and exit.
TRIG_ID Trigger ID number.
RES_TYPE Resource type: job or node
RES_ID Resource ID: job ID or host names or "*" for any host
TYPE Trigger type: time or fini (for jobs only), down or up (for jobs or nodes), or drained, idle or reconfig (for nodes only)
OFFSET Time offset in seconds. Negative numbers indicate that the action should occur before the event (if possible)
USER Name of the user requesting the action
PROGRAM Pathname of the program to execute when the event occurs
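Because the fields above are printed as whitespace-separated columns, the listing can be filtered with standard text tools. The following is a minimal sketch, assuming the column order given above; trigger_ids_of_type is a hypothetical helper name, and the sample rows only mirror the layout of a trigger listing.

```shell
#!/bin/bash
# Hypothetical helper: read a trigger listing on stdin and print the
# TRIG_ID of every trigger whose TYPE matches the first argument.
# Column 1 is TRIG_ID and column 4 is TYPE; the header row is skipped by
# name, so the filter also works when no header is printed.
trigger_ids_of_type() {
    local wanted=$1
    awk -v t="$wanted" '$1 != "TRIG_ID" && $4 == t { print $1 }'
}

# Sample input; on a real system the listing would come from
# `strigger --get` piped into the function instead.
trigger_ids_of_type down <<'EOF'
TRIG_ID RES_TYPE RES_ID TYPE OFFSET USER PROGRAM
    123      job   1235 time   -600  joe /home/bob/clean_up
    125      job   1235 down      0  joe /home/bob/node_died
EOF
```

Only the first and fourth columns are read, so a PROGRAM value containing spaces does not disturb the result.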
Some strigger options may be set via environment variables. These environment variables, along with their corresponding options, are listed below. (Note: command-line options always override these settings.)
SLURM_CONF The location of the Slurm configuration file.
Execute the program "/usr/sbin/primary_slurmctld_failure" whenever the primary slurmctld fails.
> cat /usr/sbin/primary_slurmctld_failure
#!/bin/bash
# Submit trigger for next primary slurmctld failure event
strigger --set --primary_slurmctld_failure \
         --program=/usr/sbin/primary_slurmctld_failure
# Notify the administrator of the failure by e-mail
/bin/mail email@example.com -s Primary_SLURMCTLD_FAILURE

> strigger --set --primary_slurmctld_failure \
           --program=/usr/sbin/primary_slurmctld_failure
Execute the program "/usr/sbin/slurm_admin_notify" whenever any node in the cluster goes down. The subject line will include the node names which have entered the down state (passed as an argument to the script by Slurm).
> cat /usr/sbin/slurm_admin_notify
#!/bin/bash
# Submit trigger for next event
strigger --set --node --down \
         --program=/usr/sbin/slurm_admin_notify
# Notify administrator by e-mail
/bin/mail firstname.lastname@example.org -s "NodesDown:$*"

> strigger --set --node --down \
           --program=/usr/sbin/slurm_admin_notify
Execute the program "/usr/sbin/slurm_suspend_node" whenever any node in the cluster remains in the idle state for at least 600 seconds.
> strigger --set --node --idle --offset=600 \
           --program=/usr/sbin/slurm_suspend_node
Execute the program "/home/joe/clean_up" when job 1234 is within 10 minutes of reaching its time limit.
> strigger --set --jobid=1234 --time --offset=-600 \
           --program=/home/joe/clean_up
Execute the program "/home/joe/node_died" when any node allocated to job 1234 enters the DOWN state.
> strigger --set --jobid=1234 --down \
           --program=/home/joe/node_died
Show all triggers associated with job 1235.
> strigger --get --jobid=1235
TRIG_ID RES_TYPE RES_ID TYPE OFFSET USER PROGRAM
    123      job   1235 time   -600  joe /home/bob/clean_up
    125      job   1235 down      0  joe /home/bob/node_died
Delete event trigger 125.
> strigger --clear --id=125
Execute /home/joe/job_fini upon completion of job 1237.
> strigger --set --jobid=1237 --fini --program=/home/joe/job_fini
Copyright (C) 2007 The Regents of the University of California. Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER).
Copyright (C) 2008-2010 Lawrence Livermore National Security.
Copyright (C) 2010-2013 SchedMD LLC.
This file is part of Slurm, a resource management program. For details, see <http://slurm.schedmd.com/>.
Slurm is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
Slurm is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
scontrol(1), sinfo(1), squeue(1)
April 2015    STRIGGER(1)    Slurm Commands