struct nvme_smart_log - SMART / Health Information Log (Log Identifier 02h)
struct nvme_smart_log {
__u8 critical_warning;
__u8 temperature[2];
__u8 avail_spare;
__u8 spare_thresh;
__u8 percent_used;
__u8 endu_grp_crit_warn_sumry;
__u8 rsvd7[25];
__u8 data_units_read[16];
__u8 data_units_written[16];
__u8 host_reads[16];
__u8 host_writes[16];
__u8 ctrl_busy_time[16];
__u8 power_cycles[16];
__u8 power_on_hours[16];
__u8 unsafe_shutdowns[16];
__u8 media_errors[16];
__u8 num_err_log_entries[16];
__le32 warning_temp_time;
__le32 critical_comp_time;
__le16 temp_sensor[8];
__le32 thm_temp1_trans_count;
__le32 thm_temp2_trans_count;
__le32 thm_temp1_total_time;
__le32 thm_temp2_total_time;
__u8 rsvd232[280];
};
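The layout above can be checked at compile time. The following is a minimal sketch, not part of the library: `struct smart_log_mirror` is a hypothetical stand-in that uses fixed-width `<stdint.h>` types in place of `__u8`, `__le16` and `__le32`, so it builds without the library headers. It illustrates that the field sizes add up to 512 bytes and that the reserved members (`rsvd7`, `rsvd232`) are named after their byte offsets.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical mirror of struct nvme_smart_log (same member order and sizes). */
struct smart_log_mirror {
	uint8_t  critical_warning;
	uint8_t  temperature[2];
	uint8_t  avail_spare;
	uint8_t  spare_thresh;
	uint8_t  percent_used;
	uint8_t  endu_grp_crit_warn_sumry;
	uint8_t  rsvd7[25];
	uint8_t  data_units_read[16];
	uint8_t  data_units_written[16];
	uint8_t  host_reads[16];
	uint8_t  host_writes[16];
	uint8_t  ctrl_busy_time[16];
	uint8_t  power_cycles[16];
	uint8_t  power_on_hours[16];
	uint8_t  unsafe_shutdowns[16];
	uint8_t  media_errors[16];
	uint8_t  num_err_log_entries[16];
	uint32_t warning_temp_time;
	uint32_t critical_comp_time;
	uint16_t temp_sensor[8];
	uint32_t thm_temp1_trans_count;
	uint32_t thm_temp2_trans_count;
	uint32_t thm_temp1_total_time;
	uint32_t thm_temp2_total_time;
	uint8_t  rsvd232[280];
};

/* The members sum to 512 bytes with no compiler-inserted padding. */
static_assert(sizeof(struct smart_log_mirror) == 512, "unexpected log size");

/* Reserved members are named after the byte offset at which they start. */
static_assert(offsetof(struct smart_log_mirror, rsvd7) == 7, "rsvd7 offset");
static_assert(offsetof(struct smart_log_mirror, rsvd232) == 232, "rsvd232 offset");

/* Spot checks on the first 16-byte counter and the first 32-bit field. */
static_assert(offsetof(struct smart_log_mirror, data_units_read) == 32, "offset");
static_assert(offsetof(struct smart_log_mirror, warning_temp_time) == 192, "offset");
```

Every multi-byte member already sits on a naturally aligned offset, which is why the mirror needs no packing attribute.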
- critical_warning
- This field indicates critical warnings for the state of the
controller. Critical warnings may result in an asynchronous event
notification to the host. Bits in this field represent the current
associated state and are not persistent (see enum
nvme_smart_crit).
- temperature
- Composite Temperature: Contains a value corresponding to a
temperature in Kelvins that represents the current composite temperature
of the controller and namespace(s) associated with that controller. The
manner in which this value is computed is implementation specific and may
not represent the actual temperature of any physical point in the NVM
subsystem. Warning and critical overheating composite temperature
threshold values are reported by the WCTEMP and CCTEMP fields in the
Identify Controller data structure.
- avail_spare
- Available Spare: Contains a normalized percentage (0% to
100%) of the remaining spare capacity available.
- spare_thresh
- Available Spare Threshold: When the Available Spare falls
below the threshold indicated in this field, an asynchronous event
completion may occur. The value is indicated as a normalized percentage
(0% to 100%). The values 101 to 255 are reserved.
- percent_used
- Percentage Used: Contains a vendor specific estimate of the
percentage of NVM subsystem life used based on the actual usage and the
manufacturer's prediction of NVM life. A value of 100 indicates that the
estimated endurance of the NVM in the NVM subsystem has been consumed, but
may not indicate an NVM subsystem failure. The value is allowed to exceed
100. Percentages greater than 254 shall be represented as 255. This value
shall be updated once per power-on hour (when the controller is not in a
sleep state).
- endu_grp_crit_warn_sumry
- Endurance Group Critical Warning Summary: This field
indicates critical warnings for the state of Endurance Groups. Bits in
this field represent the current associated state and are not persistent
(see enum nvme_smart_egcw).
- rsvd7
- Reserved
- data_units_read
- Data Units Read: Contains the number of 512 byte data units
the host has read from the controller; this value does not include
metadata. This value is reported in thousands (i.e., a value of 1
corresponds to 1000 units of 512 bytes read) and is rounded up (e.g., one
indicates that the number of 512 byte data units read is from 1 to 1000,
three indicates that the number of 512 byte data units read is from 2001
to 3000). When the LBA size is a value other than 512 bytes, the
controller shall convert the amount of data read to 512 byte units. For
the NVM command set, logical blocks read as part of Compare, Read, and
Verify operations shall be included in this value. A value of 0h in this
field indicates that the number of Data Units Read is not reported.
- data_units_written
- Data Units Written: Contains the number of 512 byte data
units the host has written to the controller; this value does not include
metadata. This value is reported in thousands (i.e., a value of 1
corresponds to 1000 units of 512 bytes written) and is rounded up (e.g.,
one indicates that the number of 512 byte data units written is from 1 to
1000, three indicates that the number of 512 byte data units written is
from 2001 to 3000). When the LBA size is a value other than 512 bytes, the
controller shall convert the amount of data written to 512 byte units. For
the NVM command set, logical blocks written as part of Write operations
shall be included in this value. Write Uncorrectable commands and Write
Zeroes commands shall not impact this value. A value of 0h in this field
indicates that the number of Data Units Written is not reported.
- host_reads
- Host Read Commands: Contains the number of read commands
completed by the controller. For the NVM command set, this value is the
sum of the number of Compare commands and the number of Read
commands.
- host_writes
- Host Write Commands: Contains the number of write commands
completed by the controller. For the NVM command set, this is the number
of Write commands.
- ctrl_busy_time
- Controller Busy Time: Contains the amount of time the
controller is busy with I/O commands. The controller is busy when there is
a command outstanding to an I/O Queue (specifically, a command was issued
via an I/O Submission Queue Tail doorbell write and the corresponding
completion queue entry has not been posted yet to the associated I/O
Completion Queue). This value is reported in minutes.
- power_cycles
- Power Cycles: Contains the number of power cycles.
- power_on_hours
- Power On Hours: Contains the number of power-on hours. This
may not include time that the controller was powered and in a
non-operational power state.
- unsafe_shutdowns
- Unsafe Shutdowns: Contains the number of unsafe shutdowns.
This count is incremented when a Shutdown Notification (CC.SHN) is not
received prior to loss of power.
- media_errors
- Media and Data Integrity Errors: Contains the number of
occurrences where the controller detected an unrecovered data integrity
error. Errors such as uncorrectable ECC, CRC checksum failure, or LBA tag
mismatch are included in this field. Errors introduced as a result of a
Write Uncorrectable command may or may not be included in this field.
- num_err_log_entries
- Number of Error Information Log Entries: Contains the
number of Error Information log entries over the life of the
controller.
- warning_temp_time
- Warning Composite Temperature Time: Contains the amount of
time in minutes that the controller is operational and the Composite
Temperature is greater than or equal to the Warning Composite Temperature
Threshold (WCTEMP) field and less than the Critical Composite Temperature
Threshold (CCTEMP) field in the Identify Controller data structure. If the
value of the WCTEMP or CCTEMP field is 0h, then this field is always
cleared to 0h regardless of the Composite Temperature value.
- critical_comp_time
- Critical Composite Temperature Time: Contains the amount of
time in minutes that the controller is operational and the Composite
Temperature is greater than or equal to the Critical Composite Temperature
Threshold (CCTEMP) field in the Identify Controller data structure. If the
value of the CCTEMP field is 0h, then this field is always cleared to 0h
regardless of the Composite Temperature value.
- temp_sensor
- Temperature Sensor 1-8: Contains the current temperature in
Kelvins reported by temperature sensors 1-8. The physical point in
the NVM subsystem whose temperature is reported by each temperature sensor
and the temperature accuracy are implementation specific. An implementation
that does not implement a given temperature sensor reports a value of 0h for it.
- thm_temp1_trans_count
- Thermal Management Temperature 1 Transition Count: Contains
the number of times the controller transitioned to lower power active
power states or performed vendor specific thermal management actions while
minimizing the impact on performance in order to attempt to reduce the
Composite Temperature because of the host controlled thermal management
feature (i.e., the Composite Temperature rose above the Thermal Management
Temperature 1). This counter shall not wrap once the value FFFFFFFFh is
reached. A value of 0h indicates that this transition has never occurred
or this field is not implemented.
- thm_temp2_trans_count
- Thermal Management Temperature 2 Transition Count: The
counterpart of thm_temp1_trans_count for Thermal Management Temperature 2.
- thm_temp1_total_time
- Total Time For Thermal Management Temperature 1: Contains
the number of seconds that the controller had transitioned to lower power
active power states or performed vendor specific thermal management
actions while minimizing the impact on performance in order to attempt to
reduce the Composite Temperature because of the host controlled thermal
management feature. This counter shall not wrap once the value FFFFFFFFh
is reached. A value of 0h indicates that this transition has never
occurred or this field is not implemented.
- thm_temp2_total_time
- Total Time For Thermal Management Temperature 2: The
counterpart of thm_temp1_total_time for Thermal Management Temperature 2.
- rsvd232
- Reserved
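Several members above are stored as raw byte arrays and need decoding before use. The helpers below are a minimal sketch of one way to do this. All function names are hypothetical and not part of the library. They assume the usual NVMe little-endian byte order for `temperature` and the 16-byte counters. A 16-byte counter is folded into a `double`, which loses precision above 2^53. The Celsius conversion uses the integer offset 273 rather than 273.15.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Composite Temperature: two bytes, assumed little-endian, in Kelvins. */
static uint16_t smart_temp_kelvin(const uint8_t raw[2])
{
	assert(raw != NULL);
	return (uint16_t)(raw[0] | (raw[1] << 8));
}

/* Integer Kelvin-to-Celsius conversion (offset 273, not 273.15). */
static int kelvin_to_celsius(uint16_t kelvin)
{
	return (int)kelvin - 273;
}

/*
 * Fold a 16-byte counter (assumed little-endian) into a double,
 * starting from the most significant byte. Exact up to 2^53.
 */
static double smart_counter_to_double(const uint8_t raw[16])
{
	double value = 0.0;

	assert(raw != NULL);
	for (int i = 15; i >= 0; i--)
		value = value * 256.0 + raw[i];
	return value;
}

/*
 * Data Units Read/Written are reported in thousands of 512-byte units,
 * rounded up, so the result is an upper bound on the bytes transferred.
 */
static double smart_data_units_to_bytes(const uint8_t raw[16])
{
	return smart_counter_to_double(raw) * 1000.0 * 512.0;
}
```

With a populated log, a caller would presumably pass `log->temperature` to `smart_temp_kelvin()` and `log->data_units_read` or `log->data_units_written` to `smart_data_units_to_bytes()`. A result of 0 from the latter means the value is not reported, not that no data was transferred.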