sinfo - View information about Slurm nodes and partitions.
sinfo [
OPTIONS...]
sinfo is used to view partition and node information for a system running
Slurm.
-
-a, --all
- Display information about all partitions. This causes
information to be displayed about partitions that are configured as hidden
and partitions that are unavailable to the user's group.
-
-
-M, --clusters=<string>
- Clusters to issue commands to. Multiple cluster names may
be comma separated. A value of ' all' will query all clusters. Note
that the SlurmDBD must be up for this option to work properly. This
option implicitly sets the --local option.
-
-
-d, --dead
- If set, only report state information for non-responding
(dead) nodes.
-
-
-e, --exact
- If set, do not group node information on multiple nodes
unless their configurations to be reported are identical. Otherwise cpu
count, memory size, and disk space for nodes will be listed with the
minimum value followed by a "+" for nodes with the same
partition and state (e.g. "250+").
-
- --federation
- Show all partitions from the federation if a member of
one.
-
-
-o,
--format=<output_format>
- Specify the information to be displayed using an
sinfo format string. If the command is executed in a federated
cluster environment and information about more than one cluster is to be
displayed and the -h, --noheader option is used, then the cluster
name will be displayed before the default output formats shown below.
Format strings transparently used by sinfo when running with
various options are:
- default
- "%#P %.5a %.10l %.6D %.6t %N"
-
- --summarize
- "%#P %.5a %.10l %.16F %N"
-
- --long
- "%#P %.5a %.10l %.10s %.4r %.8h %.10g %.6D %.11T
%N"
-
- --Node
- "%#N %.6D %#P %6t"
-
- --long --Node
- "%#N %.6D %#P %.11T %.4c %.8z %.6m %.8d %.6w %.8f
%20E"
-
- --list-reasons
- "%20E %9u %19H %N"
-
- --long --list-reasons
- "%20E %12U %19H %6t %N"
-
In the above format strings, the use of "#" represents the maximum
length of any partition name or node list to be printed. A pass is made
over the records to be printed to establish the size in order to align the
sinfo output, then a second pass is made over the records to print them.
Note that the literal character "#" itself is not a valid field
length specification, but is only used to document this behaviour.
The format of each field is "%[[.]size]type[suffix]"
- size
- Minimum field size. If no size is specified, whatever is
needed to print the information will be used.
-
- .
- Indicates the output should be right justified and size
must be specified. By default output is left justified.
-
- suffix
- Arbitrary string to append to the end of the field.
Valid
type specifications include:
- %all
- Print all fields available for this data type with a
vertical bar separating each field.
-
- %a
- State/availability of a partition.
-
- %A
- Number of nodes by state in the format
"allocated/idle". Do not use this with a node state option
("%t" or "%T") or the different node states will be
placed on separate lines.
-
- %b
- Features currently active on the nodes, also see
%f.
-
- %B
- The max number of CPUs per node available to jobs in the
partition.
-
- %c
- Number of CPUs per node.
-
- %C
- Number of CPUs by state in the format
"allocated/idle/other/total". Do not use this with a node state
option ("%t" or "%T") or the different node states
will be placed on separate lines.
-
- %d
- Size of temporary disk space per node in megabytes.
-
- %D
- Number of nodes.
-
- %e
- The total memory, in MB, currently free on the node as
reported by the OS. This value is for informational use only and is not
used for scheduling.
-
- %E
- The reason a node is unavailable (down, drained, or
draining states).
-
- %f
- Features available the nodes, also see %b.
-
- %F
- Number of nodes by state in the format
"allocated/idle/other/total". Note the use of this format option
with a node state format option ("%t" or "%T") will
result in the different node states being be reported on separate
lines.
-
- %g
- Groups which may use the nodes.
-
- %G
- Generic resources (gres) associated with the nodes.
-
- %h
- Print the OverSubscribe setting for the partition.
-
- %H
- Print the timestamp of the reason a node is
unavailable.
-
- %I
- Partition job priority weighting factor.
-
- %l
- Maximum time for any job in the format
"days-hours:minutes:seconds"
-
- %L
- Default time for any job in the format
"days-hours:minutes:seconds"
-
- %m
- Size of memory per node in megabytes.
-
- %M
- PreemptionMode.
-
- %n
- List of node hostnames.
-
- %N
- List of node names.
-
- %o
- List of node communication addresses.
-
- %O
- CPU load of a node as reported by the OS.
-
- %p
- Partition scheduling tier priority.
-
- %P
- Partition name followed by "*" for the default
partition, also see %R.
-
- %r
- Only user root may initiate jobs, "yes" or
"no".
-
- %R
- Partition name, also see %P.
-
- %s
- Maximum job size in nodes.
-
- %S
- Allowed allocating nodes.
-
- %t
- State of nodes, compact form.
-
- %T
- State of nodes, extended form.
-
- %u
- Print the user name of who set the reason a node is
unavailable.
-
- %U
- Print the user name and uid of who set the reason a node is
unavailable.
-
- %v
- Print the version of the running slurmd daemon.
-
- %V
- Print the cluster name if running in a federation.
-
- %w
- Scheduling weight of the nodes.
-
- %X
- Number of sockets per node.
-
- %Y
- Number of cores per socket.
-
- %Z
- Number of threads per core.
-
- %z
- Extended processor information: number of sockets, cores,
threads (S:C:T) per node.
-
-
-O,
--Format=<output_format>
- Specify the information to be displayed. Also see the -o
<output_format>, --format=<output_format> option
(which supports greater flexibility in formatting, but does not support
access to all fields because we ran out of letters). Requests a comma
separated list of job information to be displayed.
The format of each field is "type[:[.][size][suffix]]"
- size
- The minimum field size. If no size is specified, 20
characters will be allocated to print the information.
-
- .
- Indicates the output should be right justified and size
must be specified. By default, output is left justified.
-
- suffix
- Arbitrary string to append to the end of the field.
Valid
type specifications include:
- All
- Print all fields available in the -o format for this data
type with a vertical bar separating each field.
-
- AllocMem
- Prints the amount of allocated memory on a node.
-
- AllocNodes
- Allowed allocating nodes.
-
- Available
- State/availability of a partition.
-
- Cluster
- Print the cluster name if running in a federation.
-
- Comment
- Comment. (Arbitrary descriptive string)
-
- Cores
- Number of cores per socket.
-
- CPUs
- Number of CPUs per node.
-
- CPUsLoad
- CPU load of a node as reported by the OS.
-
- CPUsState
- Number of CPUs by state in the format
"allocated/idle/other/total". Do not use this with a node state
option ("%t" or "%T") or the different node states
will be placed on separate lines.
-
- DefaultTime
- Default time for any job in the format
"days-hours:minutes:seconds".
-
- Disk
- Size of temporary disk space per node in megabytes.
-
- Extra
- Arbitrary string on the node.
-
- Features
- Features available on the nodes. Also see
features_act.
-
- features_act
- Features currently active on the nodes. Also see
features.
-
- FreeMem
- The total memory, in MB, currently free on the node as
reported by the OS. This value is for informational use only and is not
used for scheduling.
-
- Gres
- Generic resources (gres) associated with the nodes.
-
- GresUsed
- Generic resources (gres) currently in use on the
nodes.
-
- Groups
- Groups which may use the nodes.
-
- MaxCPUsPerNode
- The max number of CPUs per node available to jobs in the
partition.
-
- Memory
- Size of memory per node in megabytes.
-
- NodeAddr
- List of node communication addresses.
-
- NodeAI
- Number of nodes by state in the format
"allocated/idle". Do not use this with a node state option
("%t" or "%T") or the different node states will be
placed on separate lines.
-
- NodeAIOT
- Number of nodes by state in the format
"allocated/idle/other/total". Do not use this with a node state
option ("%t" or "%T") or the different node states
will be placed on separate lines.
-
- NodeHost
- List of node hostnames.
-
- NodeList
- List of node names.
-
- Nodes
- Number of nodes.
-
- OverSubscribe
- Whether jobs may oversubscribe compute resources (e.g.
CPUs).
-
- Partition
- Partition name followed by "*" for the default
partition, also see %R.
-
- PartitionName
- Partition name, also see %P.
-
- Port
- Node TCP port.
-
- PreemptMode
- Preemption mode.
-
- PriorityJobFactor
- Partition factor used by priority/multifactor plugin in
calculating job priority.
-
-
PriorityTier or Priority
- Partition scheduling tier priority.
-
- Reason
- The reason a node is unavailable (down, drained, or
draining states).
-
- Root
- Only user root may initiate jobs, "yes" or
"no".
-
- Size
- Maximum job size in nodes.
-
- SocketCoreThread
- Extended processor information: number of sockets, cores,
threads (S:C:T) per node.
-
- Sockets
- Number of sockets per node.
-
- StateCompact
- State of nodes, compact form.
-
- StateLong
- State of nodes, extended form.
-
- StateComplete
- State of nodes, including all node state flags. eg.
"idle+cloud+power"
-
- Threads
- Number of threads per core.
-
- Time
- Maximum time for any job in the format
"days-hours:minutes:seconds".
-
- TimeStamp
- Print the timestamp of the reason a node is
unavailable.
-
- User
- Print the user name of who set the reason a node is
unavailable.
-
- UserLong
- Print the user name and uid of who set the reason a node is
unavailable.
-
- Version
- Print the version of the running slurmd daemon.
-
- Weight
- Scheduling weight of the nodes.
-
- --help
- Print a message describing all sinfo options.
-
- --hide
- Do not display information about hidden partitions.
Partitions that are configured as hidden or are not available to the
user's group will not be displayed. This is the default behavior.
-
-
-i, --iterate=<seconds>
- Print the state on a periodic basis. Sleep for the
indicated number of seconds between reports. By default prints a time
stamp with the header.
-
- --json
- Dump node information as JSON. All other formatting and
filtering arguments will be ignored.
-
-
-R, --list-reasons
- List reasons nodes are in the down, drained, fail or
failing state. When nodes are in these states Slurm supports the inclusion
of a "reason" string by an administrator. This option will
display the first 20 characters of the reason field and list of nodes with
that reason for all nodes that are, by default, down, drained, draining or
failing. This option may be used with other node filtering options (e.g.
-r, -d, -t, -n), however, combinations of
these options that result in a list of nodes that are not down or drained
or failing will not produce any output. When used with -l the
output additionally includes the current node state.
-
- --local
- Show only jobs local to this cluster. Ignore other clusters
in this federation (if any). Overrides --federation.
-
-
-l, --long
- Print more detailed information. This is ignored if the
--format option is specified.
-
- --noconvert
- Don't convert units from their original type (e.g. 2048M
won't be converted to 2G).
-
-
-N, --Node
- Print information in a node-oriented format with one line
per node and partition. That is, if a node belongs to more than one
partition, then one line for each node-partition pair will be shown. If
--partition is also specified, then only one line per node in this
partition is shown. The default is to print information in a
partition-oriented format. This is ignored if the --format option
is specified.
-
-
-n, --nodes=<nodes>
- Print information about the specified node(s). Multiple
nodes may be comma separated or expressed using a node range expression
(e.g. "linux[00-17]") Limiting the query to just the relevant
nodes can measurably improve the performance of the command for large
clusters.
-
-
-h, --noheader
- Do not print a header on the output.
-
-
-p, --partition=<partition>
- Print information only about the specified partition(s).
Multiple partitions are separated by commas.
-
-
-T, --reservation
- Only display information about Slurm reservations.
NOTE: This option causes sinfo to ignore most other options,
which are focused on partition and node information.
-
-
-r, --responding
- If set only report state information for responding
nodes.
-
-
-S, --sort=<sort_list>
- Specification of the order in which records should be
reported. This uses the same field specification as the
<output_format>. Multiple sorts may be performed by listing multiple
sort fields separated by commas. The field specifications may be preceded
by "+" or "-" for ascending (default) and descending
order respectively. The partition field specification, "P", may
be preceded by a "#" to report partitions in the same order that
they appear in Slurm's configuration file, slurm.conf. For example,
a sort value of "+P,-m" requests that records be printed in
order of increasing partition name and within a partition by decreasing
memory size. The default value of sort is "#P,-t" (partitions
ordered as configured then decreasing node state). If the --Node
option is selected, the default sort value is "N" (increasing
node name).
-
-
-t, --states=<states>
- List nodes only having the given state(s). Multiple states
may be comma separated and the comparison is case insensitive. If the
states are separated by '&', then the nodes must be in all states.
Possible values include (case insensitive): ALLOC, ALLOCATED, CLOUD, COMP,
COMPLETING, DOWN, DRAIN (for node in DRAINING or DRAINED states), DRAINED,
DRAINING, FAIL, FUTURE, FUTR, IDLE, MAINT, MIX, MIXED, NO_RESPOND, NPC,
PERFCTRS, PLANNED, POWER_DOWN, POWERING_DOWN, POWERED_DOWN, POWERING_UP,
REBOOT_ISSUED, REBOOT_REQUESTED, RESV, RESERVED, UNK, and UNKNOWN. By
default nodes in the specified state are reported whether they are
responding or not. The --dead and --responding options may
be used to filter nodes by the corresponding flag.
-
-
-s, --summarize
- List only a partition state summary with no node state
details. This is ignored if the --format option is specified.
-
- --usage
- Print a brief message listing the sinfo
options.
-
-
-v, --verbose
- Provide detailed event logging through program
execution.
-
-
-V, --version
- Print version information and exit.
-
- --yaml
- Dump node information as YAML. All other formatting and
filtering arguments will be ignored.
-
- AVAIL
- Partition state. Can be either up, down,
drain, or inact (for INACTIVE). See the partition
definition's State parameter in the slurm.conf(5) man page
for more information.
-
- CPUS
- Count of CPUs (processors) on these nodes.
-
- S:C:T
- Count of sockets (S), cores (C), and threads (T) on these
nodes.
-
- SOCKETS
- Count of sockets on these nodes.
-
- CORES
- Count of cores on these nodes.
-
- THREADS
- Count of threads on these nodes.
-
- GROUPS
- Resource allocations in this partition are restricted to
the named groups. all indicates that all groups may use this
partition.
-
- JOB_SIZE
- Minimum and maximum node count that can be allocated to any
user job. A single number indicates the minimum and maximum node count are
the same. infinite is used to identify partitions without a maximum
node count.
-
- TIMELIMIT
- Maximum time limit for any user job in
days-hours:minutes:seconds. infinite is used to identify partitions
without a job time limit.
-
- MEMORY
- Size of real memory in megabytes on these nodes.
-
- NODELIST
- Names of nodes associated with this particular
configuration.
-
- NODES
- Count of nodes with this particular configuration.
-
- NODES(A/I)
- Count of nodes with this particular configuration by node
state in the form "allocated/idle".
-
- NODES(A/I/O/T)
- Count of nodes with this particular configuration by node
state in the form "allocated/idle/other/total".
-
- PARTITION
- Name of a partition. Note that the suffix "*"
identifies the default partition.
-
- PORT
- Local TCP port used by slurmd on the node.
-
- ROOT
- Is the ability to allocate resources in this partition
restricted to user root, yes or no.
-
- OVERSUBSCRIBE
- Whether jobs allocated resources in this partition can/will
oversubscribe those compute resources (e.g. CPUs). NO indicates
resources are never oversubscribed. EXCLUSIVE indicates whole nodes
are dedicated to jobs (equivalent to srun --exclusive option, may be used
even with select/cons_res managing individual processors). FORCE
indicates resources are always available to be oversubscribed. YES
indicates resource may be oversubscribed, if requested by the job's
resource allocation.
NOTE: If OverSubscribe is set to FORCE or YES, the OversubScribe
value will be appended to the output.
-
- STATE
- State of the nodes. Possible states include: allocated,
completing, down, drained, draining, fail, failing, future, idle, maint,
mixed, perfctrs, planned, power_down, power_up, reserved, and unknown.
Their abbreviated forms are: alloc, comp, down, drain, drng, fail, failg,
futr, idle, maint, mix, npc, plnd, pow_dn, pow_up, resv, and unk
respectively.
NOTE: The suffix "*" identifies nodes that are presently
not responding.
-
- TMP_DISK
- Size of temporary disk space in megabytes on these
nodes.
-
Node state codes are shortened as required for the field size. These node states
may be followed by a special character to identify state flags associated with
the node. The following node suffixes and states are used:
- *
- The node is presently not responding and will not be
allocated any new work. If the node remains non-responsive, it will be
placed in the DOWN state (except in the case of COMPLETING,
DRAINED, DRAINING, FAIL, FAILING nodes).
-
- ~
- The node is presently in powered off.
-
- #
- The node is presently being powered up or configured.
-
- !
- The node is pending power down.
-
- %
- The node is presently being powered down.
-
- $
- The node is currently in a reservation with a flag value of
"maintenance".
-
- @
- The node is pending reboot.
-
- ^
- The node reboot was issued.
-
- -
- The node is planned by the backfill scheduler for a higher
priority job.
-
- ALLOCATED
- The node has been allocated to one or more jobs.
-
- ALLOCATED+
- The node is allocated to one or more active jobs plus one
or more jobs are in the process of COMPLETING.
-
- COMPLETING
- All jobs associated with this node are in the process of
COMPLETING. This node state will be removed when all of the job's
processes have terminated and the Slurm epilog program (if any) has
terminated. See the Epilog parameter description in the
slurm.conf(5) man page for more information.
-
- DOWN
- The node is unavailable for use. Slurm can automatically
place nodes in this state if some failure occurs. System administrators
may also explicitly place nodes in this state. If a node resumes normal
operation, Slurm can automatically return it to service. See the
ReturnToService and SlurmdTimeout parameter descriptions in
the slurm.conf(5) man page for more information.
-
- DRAINED
- The node is unavailable for use per system administrator
request. See the update node command in the scontrol(1) man
page or the slurm.conf(5) man page for more information.
-
- DRAINING
- The node is currently executing a job, but will not be
allocated additional jobs. The node state will be changed to state
DRAINED when the last job on it completes. Nodes enter this state
per system administrator request. See the update node
command in the scontrol(1) man page or the slurm.conf(5) man
page for more information.
-
- FAIL
- The node is expected to fail soon and is unavailable for
use per system administrator request. See the update node command
in the scontrol(1) man page or the slurm.conf(5) man page
for more information.
-
- FAILING
- The node is currently executing a job, but is expected to
fail soon and is unavailable for use per system administrator request. See
the update node command in the scontrol(1) man page or the
slurm.conf(5) man page for more information.
-
- FUTURE
- The node is currently not fully configured, but expected to
be available at some point in the indefinite future for use.
-
- IDLE
- The node is not allocated to any jobs and is available for
use.
-
- INVAL
- The node did not register correctly with the controller.
This happens when a node registers with less resources than configured in
the slurm.conf file. The node will clear from this state with a valid
registration (i.e. a slurmd restart is required).
-
- MAINT
- The node is currently in a reservation with a flag value of
"maintenance".
-
- REBOOT_ISSUED
- A reboot request has been sent to the agent configured to
handle this request.
-
- REBOOT_REQUESTED
- A request to reboot this node has been made, but hasn't
been handled yet.
-
- MIXED
- The node has some of its CPUs ALLOCATED while others
are IDLE.
-
- PERFCTRS (NPC)
- Network Performance Counters associated with this node are
in use, rendering this node as not usable for any other jobs
-
- PLANNED
- The node is planned by the backfill scheduler for a higher
priority job.
-
- POWER_DOWN
- The node is pending power down.
-
- POWERED_DOWN
- The node is currently powered down and not capable of
running any jobs.
-
- POWERING_DOWN
- The node is in the process of powering down and not capable
of running any jobs.
-
- POWERING_UP
- The node is in the process of being powered up.
-
- RESERVED
- The node is in an advanced reservation and not generally
available.
-
- UNKNOWN
- The Slurm controller has just started and the node's state
has not yet been determined.
-
Executing
sinfo sends a remote procedure call to
slurmctld. If
enough calls from
sinfo or other Slurm client commands that send remote
procedure calls to the
slurmctld daemon come in at once, it can result
in a degradation of performance of the
slurmctld daemon, possibly
resulting in a denial of service.
Do not run
sinfo or other Slurm client commands that send remote
procedure calls to
slurmctld from loops in shell scripts or other
programs. Ensure that programs limit calls to
sinfo to the minimum
necessary for the information you are trying to gather.
Some
sinfo options may be set via environment variables. These
environment variables, along with their corresponding options, are listed
below.
NOTE: Command line options will always override these settings.
- SINFO_ALL
- Same as -a, --all
-
- SINFO_FEDERATION
- Same as --federation
-
- SINFO_FORMAT
- Same as -o <output_format>,
--format=<output_format>
-
- SINFO_LOCAL
- Same as --local
-
- SINFO_PARTITION
- Same as -p <partition>,
--partition=<partition>
-
- SINFO_SORT
- Same as -S <sort>, --sort=<sort>
-
- SLURM_CLUSTERS
- Same as --clusters
-
- SLURM_CONF
- The location of the Slurm configuration file.
-
- SLURM_DEBUG_FLAGS
- Specify debug flags for sinfo to use. See DebugFlags in the
slurm.conf(5) man page for a full list of flags. The environment
variable takes precedence over the setting in the slurm.conf.
-
- SLURM_TIME_FORMAT
- Specify the format used to report time stamps. A value of
standard, the default value, generates output in the form
"year-month-dateThour:minute:second". A value of relative
returns only "hour:minute:second" if the current day. For other
dates in the current year it prints the "hour:minute" preceded
by "Tomorr" (tomorrow), "Ystday" (yesterday), the name
of the day for the coming week (e.g. "Mon", "Tue",
etc.), otherwise the date (e.g. "25 Apr"). For other years it
returns a date month and year without a time (e.g. "6 Jun
2012"). All of the time stamps use a 24 hour format.
A valid strftime() format can also be specified. For example, a value of
"%a %T" will report the day of the week and a time stamp (e.g.
"Mon 12:34:56").
-
- Report basic node and partition configurations:
-
$ sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
batch up infinite 2 alloc adev[8-9]
batch up infinite 6 idle adev[10-15]
debug* up 30:00 8 idle adev[0-7]
- Report partition summary information:
-
$ sinfo -s
PARTITION AVAIL TIMELIMIT NODES(A/I/O/T) NODELIST
batch up infinite 2/6/0/8 adev[8-15]
debug* up 30:00 0/8/0/8 adev[0-7]
- Report more complete information about the partition
debug:
-
$ sinfo --long --partition=debug
PARTITION AVAIL TIMELIMIT JOB_SIZE ROOT OVERSUBS GROUPS NODES STATE NODELIST
debug* up 30:00 8 no no all 8 idle dev[0-7]
- Report only those nodes that are in state DRAINED:
-
$ sinfo --states=drained
PARTITION AVAIL NODES TIMELIMIT STATE NODELIST
debug* up 2 30:00 drain adev[6-7]
- Report node-oriented information with details and exact
matches:
-
$ sinfo -Nel
NODELIST NODES PARTITION STATE CPUS MEMORY TMP_DISK WEIGHT FEATURES REASON
adev[0-1] 2 debug* idle 2 3448 38536 16 (null) (null)
adev[2,4-7] 5 debug* idle 2 3384 38536 16 (null) (null)
adev3 1 debug* idle 2 3394 38536 16 (null) (null)
adev[8-9] 2 batch allocated 2 246 82306 16 (null) (null)
adev[10-15] 6 batch idle 2 246 82306 16 (null) (null)
- Report only down, drained and draining nodes and their
reason field:
-
$ sinfo -R
REASON NODELIST
Memory errors dev[0,5]
Not Responding dev8
Copyright (C) 2002-2007 The Regents of the University of California. Produced at
Lawrence Livermore National Laboratory (cf, DISCLAIMER).
Copyright (C) 2008-2009 Lawrence Livermore National Security.
Copyright (C) 2010-2022 SchedMD LLC.
This file is part of Slurm, a resource management program. For details, see
<
https://slurm.schedmd.com/>.
Slurm is free software; you can redistribute it and/or modify it under the terms
of the GNU General Public License as published by the Free Software
Foundation; either version 2 of the License, or (at your option) any later
version.
Slurm is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR
A PARTICULAR PURPOSE. See the GNU General Public License for more details.
scontrol(1),
squeue(1),
slurm_load_ctl_conf (3),
slurm_load_jobs (3),
slurm_load_node (3),
slurm_load_partitions (3),
slurm_reconfigure (3),
slurm_shutdown (3),
slurm_update_job (3),
slurm_update_node (3),
slurm_update_partition (3),
slurm.conf(5)