smart
About
smartd is a daemon that monitors the Self-Monitoring, Analysis and Reporting Technology (SMART) system built into most ATA/SATA and SCSI/SAS hard drives and solid-state drives. The purpose of SMART is to monitor the reliability of the hard drive and predict drive failures, and to carry out different types of drive self-tests. This version of smartd is compatible with ACS-3, ACS-2, ATA8-ACS, ATA/ATAPI-7 and earlier standards.
Tools
smartctl
smartctl is the command-line tool for inspecting disks.
One shot on all devices
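A one-shot health check across all drives might look like the following sketch. It assumes that smartctl --scan lists each device as "DEVICE -d TYPE # comment" and that it is run with root privileges. The function name check_all_disks and the SMARTCTL override variable are hypothetical helpers, not part of smartmontools.

```shell
# Hypothetical helper: print the overall SMART health verdict for every
# device reported by "smartctl --scan".
# SMARTCTL may be set to override the command (assumption, e.g. for testing).
check_all_disks() {
    ${SMARTCTL:-smartctl} --scan | while read -r dev flag type rest; do
        # Each scan line is expected to look like: /dev/sda -d scsi # comment
        [ "$flag" = "-d" ] || continue
        echo "=== $dev ($type) ==="
        # stdin is redirected so the command cannot consume the scan list
        ${SMARTCTL:-smartctl} -H -d "$type" "$dev" < /dev/null
    done
}
```

Calling check_all_disks then prints one health block per detected drive.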
GSmartControl
Hard disk drive and SSD health inspection tool
GSmartControl is a graphical user interface for smartctl (from smartmontools package), which is a tool for querying and controlling SMART (Self-Monitoring, Analysis, and Reporting Technology) data on modern hard disk and solid-state drives. It allows you to inspect the drive's SMART data to determine its health, as well as run various tests on it.
sudo gsmartcontrol
Notes
- Monitored devices: with DEVICESCAN, smartd examines all entries /dev/hd[a-t] for IDE/ATA devices, and /dev/sd[a-z], /dev/sd[a-c][a-z] for ATA/SATA or SCSI/SAS devices.
- Disks behind RAID controllers are not included.
- If the directive -d nvme or no -d directive is specified, smartd also examines all entries /dev/nvme[0-99] for NVMe devices.
- Entries reporting failure of SMART Prefailure Attributes should not be ignored: they mean that the disk is failing.
- Use the smartctl utility to investigate.
- Long entries in /etc/smartd.conf can be continued on the next line with a trailing backslash (\).
Installation
aptitude install smartmontools mailutils
Test mail
The defaults are probably fine, but you still have to test your mail delivery.
If you are using /etc/hosts to set the FQDN, make sure the hostname is set correctly (with no trailing dots).
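The FQDN check can be sketched as a small shell function. The function name is_clean_fqdn is hypothetical, and the rule that a usable FQDN contains at least one dot is an assumption added on top of the no-trailing-dot requirement.

```shell
# Hypothetical helper: succeed only if the name looks like a usable FQDN.
# Assumption: a usable FQDN contains a dot and does not end in one.
is_clean_fqdn() {
    case "$1" in
        "" | *.) return 1 ;;   # empty, or ends with a trailing dot
        *.*)     return 0 ;;   # contains a dot and does not end in one
        *)       return 1 ;;   # bare host name without a domain part
    esac
}
```

On a Debian-style system it could be fed the output of hostname --fqdn, for example: is_clean_fqdn "$(hostname --fqdn)" && echo "FQDN looks fine".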
Make sure your aliases are set up correctly
/etc/aliases
root: root@rockstable.it
Rebuild the alias database
newaliases
Run a quick check
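A quick delivery check could look like the sketch below. It assumes the mail command installed by mailutils and its -s subject option. The function name send_test_mail and the MAIL_CMD override variable are hypothetical.

```shell
# Hypothetical helper: send a short test message to a recipient (default: root).
# MAIL_CMD may override the mail program (assumption, e.g. for testing).
send_test_mail() {
    recipient="${1:-root}"
    # The body names the sending host so the message is easy to recognise
    printf 'Test mail from %s\n' "$(uname -n)" |
        ${MAIL_CMD:-mail} -s "smartd test mail" "$recipient"
}
```

Running send_test_mail without arguments addresses the message to root, so it should arrive at the address configured for root in /etc/aliases.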
Configure
The default config will:
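- scan for all devices and keep running if a removable device is absent at startup (-d removable),
- check the device unless it is in SLEEP or STANDBY mode (-n standby),
- and on failure send an email to "root" (through /usr/share/smartmontools/smartd-runner)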
- check if mailx or mailutils are installed
/etc/smartd.conf
# Sample configuration file for smartd. See man smartd.conf.

# Home page is: http://www.smartmontools.org

# smartd will re-read the configuration file if it receives a HUP
# signal

# The file gives a list of devices to monitor using smartd, with one
# device per line. Text after a hash (#) is ignored, and you may use
# spaces and tabs for white space. You may use '\' to continue lines.

# You can usually identify which hard disks are on your system by
# looking in /proc/ide and in /proc/scsi.

# The word DEVICESCAN will cause any remaining lines in this
# configuration file to be ignored: it tells smartd to scan for all
# ATA and SCSI devices. DEVICESCAN may be followed by any of the
# Directives listed below, which will be applied to all devices that
# are found. Most users should comment out DEVICESCAN and explicitly
# list the devices that they wish to monitor.
DEVICESCAN -d removable -n standby -m root -M exec /usr/share/smartmontools/smartd-runner

# Alternative setting to ignore temperature and power-on hours reports
# in syslog.
#DEVICESCAN -I 194 -I 231 -I 9

# Alternative setting to report more useful raw temperature in syslog.
#DEVICESCAN -R 194 -R 231 -I 9

# Alternative setting to report raw temperature changes >= 5 Celsius
# and min/max temperatures.
#DEVICESCAN -I 194 -I 231 -I 9 -W 5

# First ATA/SATA or SCSI/SAS disk. Monitor all attributes, enable
# automatic online data collection, automatic Attribute autosave, and
# start a short self-test every day between 2-3am, and a long self test
# Saturdays between 3-4am.
#/dev/sda -a -o on -S on -s (S/../.././02|L/../../6/03)

# Monitor SMART status, ATA Error Log, Self-test log, and track
# changes in all attributes except for attribute 194
#/dev/sdb -H -l error -l selftest -t -I 194

# Monitor all attributes except normalized Temperature (usually 194),
# but track Temperature changes >= 4 Celsius, report Temperatures
# >= 45 Celsius and changes in Raw value of Reallocated_Sector_Ct (5).
# Send mail on SMART failures or when Temperature is >= 55 Celsius.
#/dev/sdc -a -I 194 -W 4,45,55 -R 5 -m admin@example.com

# An ATA disk may appear as a SCSI device to the OS. If a SCSI to
# ATA Translation (SAT) layer is between the OS and the device then
# this can be flagged with the '-d sat' option. This situation may
# become common with SATA disks in SAS and FC environments.
# /dev/sda -a -d sat

# A very silent check. Only report SMART health status if it fails
# But send an email in this case
#/dev/sdc -H -C 0 -U 0 -m admin@example.com

# First two SCSI disks. This will monitor everything that smartd can
# monitor. Start extended self-tests Wednesdays between 6-7pm and
# Sundays between 1-2 am
#/dev/sda -d scsi -s L/../../3/18
#/dev/sdb -d scsi -s L/../../7/01

# Monitor 4 ATA disks connected to a 3ware 6/7/8000 controller which uses
# the 3w-xxxx driver. Start long self-tests Sundays between 1-2, 2-3, 3-4,
# and 4-5 am.
# NOTE: starting with the Linux 2.6 kernel series, the /dev/sdX interface
# is DEPRECATED. Use the /dev/tweN character device interface instead.
# For example /dev/twe0, /dev/twe1, and so on.
#/dev/sdc -d 3ware,0 -a -s L/../../7/01
#/dev/sdc -d 3ware,1 -a -s L/../../7/02
#/dev/sdc -d 3ware,2 -a -s L/../../7/03
#/dev/sdc -d 3ware,3 -a -s L/../../7/04

# Monitor 2 ATA disks connected to a 3ware 9000 controller which
# uses the 3w-9xxx driver (Linux, FreeBSD). Start long self-tests Tuesdays
# between 1-2 and 3-4 am.
#/dev/twa0 -d 3ware,0 -a -s L/../../2/01
#/dev/twa0 -d 3ware,1 -a -s L/../../2/03

# Monitor 2 SATA (not SAS) disks connected to a 3ware 9000 controller which
# uses the 3w-sas driver (Linux). Start long self-tests Tuesdays
# between 1-2 and 3-4 am.
# On FreeBSD /dev/tws0 should be used instead
#/dev/twl0 -d 3ware,0 -a -s L/../../2/01
#/dev/twl0 -d 3ware,1 -a -s L/../../2/03

# Same as above for Windows. Option '-d 3ware,N' is not necessary,
# disk (port) number is specified in device name.
# NOTE: On Windows, DEVICESCAN works also for 3ware controllers.
#/dev/hdc,0 -a -s L/../../2/01
#/dev/hdc,1 -a -s L/../../2/03
#
# Monitor 2 disks connected to the first HP SmartArray controller which
# uses the cciss driver. Start long tests on Sunday nights and short
# self-tests every night and send errors to root
#/dev/sda -d cciss,0 -a -s (L/../../7/02|S/../.././02) -m root
#/dev/sda -d cciss,1 -a -s (L/../../7/03|S/../.././03) -m root

# Monitor 3 ATA disks directly connected to a HighPoint RocketRAID. Start long
# self-tests Sundays between 1-2, 2-3, and 3-4 am.
#/dev/sdd -d hpt,1/1 -a -s L/../../7/01
#/dev/sdd -d hpt,1/2 -a -s L/../../7/02
#/dev/sdd -d hpt,1/3 -a -s L/../../7/03

# Monitor 2 ATA disks connected to the same PMPort which connected to the
# HighPoint RocketRAID. Start long self-tests Tuesdays between 1-2 and 3-4 am
#/dev/sdd -d hpt,1/4/1 -a -s L/../../2/01
#/dev/sdd -d hpt,1/4/2 -a -s L/../../2/03

# HERE IS A LIST OF DIRECTIVES FOR THIS CONFIGURATION FILE.
# PLEASE SEE THE smartd.conf MAN PAGE FOR DETAILS
#
# -d TYPE Set the device type: ata, scsi, marvell, removable, 3ware,N, hpt,L/M/N
# -T TYPE set the tolerance to one of: normal, permissive
# -o VAL Enable/disable automatic offline tests (on/off)
# -S VAL Enable/disable attribute autosave (on/off)
# -n MODE No check. MODE is one of: never, sleep, standby, idle
# -H Monitor SMART Health Status, report if failed
# -l TYPE Monitor SMART log. Type is one of: error, selftest
# -f Monitor for failure of any 'Usage' Attributes
# -m ADD Send warning email to ADD for -H, -l error, -l selftest, and -f
# -M TYPE Modify email warning behavior (see man page)
# -s REGE Start self-test when type/date matches regular expression (see man page)
# -p Report changes in 'Prefailure' Normalized Attributes
# -u Report changes in 'Usage' Normalized Attributes
# -t Equivalent to -p and -u Directives
# -r ID Also report Raw values of Attribute ID with -p, -u or -t
# -R ID Track changes in Attribute ID Raw value with -p, -u or -t
# -i ID Ignore Attribute ID for -f Directive
# -I ID Ignore Attribute ID for -p, -u or -t Directive
# -C ID Report if Current Pending Sector count non-zero
# -U ID Report if Offline Uncorrectable count non-zero
# -W D,I,C Monitor Temperature D)ifference, I)nformal limit, C)ritical limit
# -v N,ST Modifies labeling of Attribute N (see man page)
# -a Default: equivalent to -H -f -t -l error -l selftest -C 197 -U 198
# -F TYPE Use firmware bug workaround. Type is one of: none, samsung
# -P TYPE Drive-specific presets: use, ignore, show, showall
# # Comment: text after a hash sign is ignored
# \ Line continuation character
# Attribute ID is a decimal integer 1 <= ID <= 255
# except for -C and -U, where ID = 0 turns them off.
# All but -d, -m and -M Directives are only implemented for ATA devices
#
# If the test string DEVICESCAN is the first uncommented text
# then smartd will scan for devices.
# DEVICESCAN may be followed by any desired Directives.
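The -s schedules in the sample file are regular expressions that smartd compares against strings of the form T/MM/DD/d/HH (test type, month, day of month, day of week with 1 = Monday, hour). The sketch below mimics that comparison with grep so a schedule can be tried out before it goes into the file. Both function names are hypothetical, and using grep -E with whole-line matching is an assumption about how closely it mirrors smartd's own matcher.

```shell
# Hypothetical helper: does a smartd-style schedule regex ($1) match a
# given T/MM/DD/d/HH string ($2)? grep -E with -x (whole-line match) is
# used as a stand-in for smartd's matcher; that equivalence is an assumption.
schedule_matches() {
    printf '%s\n' "$2" | grep -Eqx -- "$1"
}

# Hypothetical helper: would a test of type $2 (S or L) be due right now?
# date prints month/day/weekday(1=Monday)/hour in the MM/DD/d/HH layout.
schedule_due_now() {
    schedule_matches "$1" "$2/$(date +%m/%d/%u/%H)"
}
```

For example, schedule_matches '(S/../.././02|L/../../6/03)' 'L/01/15/6/03' succeeds, because the second alternative selects a long test on day 6 (Saturday) in hour 03, while the same schedule does not match 'L/01/15/5/03'.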