Build your own cloud - Checking the hard drive

Testing the hard drive

Since the hard drive I'm going to use is a bit long in the tooth, the first thing I did was hook it up to my laptop and run a couple of tests on it. I have (for now) openSUSE Leap 42.2 installed. The laptop has its internal hard drive (/dev/sda), and when I plug in the external disk, the operating system will recognize it as /dev/sdb. Bear in mind that this may differ on your distribution and/or if your computer has more disks. The results I show here are only a guide; yours will not look exactly the same. Be patient and careful, and remember that the net is full of information. And don't plug in the external USB disk just yet; we have a few other checks to run first.


We're going to use a few programs (utilities, system tools), and since I like the command line better than the graphical interface, I'll run every command from a bash console as the administrator (root). Check that your seat belt is fastened and open a root console.

Let's check that we have smartmontools (www.smartmontools.org) installed. If not, install it with yast or zypper.

earth:~ # rpm -qa | grep smartmontools
smartmontools-6.5-121.1.x86_64
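
If you want to script this check, one option is to look for the smartctl binary itself rather than for the RPM, which also works on distributions that don't use rpm. A minimal sketch: require_tool is a hypothetical helper name of mine, and the install hint assumes an openSUSE system with zypper.

```shell
# require_tool NAME PACKAGE: succeed if NAME is on the PATH; otherwise print
# an install hint to stderr and fail.
# (Hypothetical helper; the zypper hint assumes openSUSE.)
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "$1 not found; try: zypper install $2" >&2
    return 1
}
```

From the root console it would be called as: require_tool smartctl smartmontools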

To be able to check the disks, the first step is to update the smartmontools drive database:

earth:~ # /usr/sbin/update-smart-drivedb
/usr/share/smartmontools/drivedb.h updated from branches/RELEASE_6_5_DRIVEDB

Next, before plugging in any external disk, take a look at the devices the operating system recognizes as disks:

earth:~ # ls -ls /dev/sd*
0 brw-rw---- 1 root disk 8, 0 Aug 23 10:42 /dev/sda
0 brw-rw---- 1 root disk 8, 1 Aug 23 10:42 /dev/sda1
0 brw-rw---- 1 root disk 8, 10 Aug 23 10:42 /dev/sda10
0 brw-rw---- 1 root disk 8, 2 Aug 23 10:42 /dev/sda2
0 brw-rw---- 1 root disk 8, 3 Aug 23 10:42 /dev/sda3
0 brw-rw---- 1 root disk 8, 5 Aug 23 10:42 /dev/sda5
0 brw-rw---- 1 root disk 8, 6 Aug 23 10:42 /dev/sda6
0 brw-rw---- 1 root disk 8, 7 Aug 23 10:42 /dev/sda7
0 brw-rw---- 1 root disk 8, 8 Aug 23 10:42 /dev/sda8
0 brw-rw---- 1 root disk 8, 9 Aug 23 10:42 /dev/sda9

Now we plug in the external disk and run the command again:

earth:~ # ls -ls /dev/sd*
0 brw-rw---- 1 root disk 8, 0 Aug 23 10:42 /dev/sda
0 brw-rw---- 1 root disk 8, 1 Aug 23 10:42 /dev/sda1
0 brw-rw---- 1 root disk 8, 10 Aug 23 10:42 /dev/sda10
0 brw-rw---- 1 root disk 8, 2 Aug 23 10:42 /dev/sda2
0 brw-rw---- 1 root disk 8, 3 Aug 23 10:42 /dev/sda3
0 brw-rw---- 1 root disk 8, 5 Aug 23 10:42 /dev/sda5
0 brw-rw---- 1 root disk 8, 6 Aug 23 10:42 /dev/sda6
0 brw-rw---- 1 root disk 8, 7 Aug 23 10:42 /dev/sda7
0 brw-rw---- 1 root disk 8, 8 Aug 23 10:42 /dev/sda8
0 brw-rw---- 1 root disk 8, 9 Aug 23 10:42 /dev/sda9
0 brw-rw---- 1 root disk 8, 16 Aug 23 11:10 /dev/sdb
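
Comparing the two listings by eye works, but the same before/after comparison can be scripted. A small sketch, assuming each listing was saved to a file with one device path per line; new_devices and the file names are made up for the example.

```shell
# new_devices BEFORE AFTER: print the lines of AFTER that do not appear in BEFORE.
# -F fixed strings, -x whole-line match, -v invert, -f read patterns from a file.
# Note: grep exits with status 1 when nothing new showed up.
new_devices() {
    grep -Fxv -f "$1" "$2"
}
```

Usage would be something like: ls /dev/sd* > /tmp/before.txt, plug in the disk, ls /dev/sd* > /tmp/after.txt, and then new_devices /tmp/before.txt /tmp/after.txt.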

The next step is to find out what kind of disk is inside the enclosure, and how healthy it is.

earth:~ # smartctl -i /dev/sdb
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.4.79-18.26-default] (SUSE RPM)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.10
Device Model:     ST3320820A
Serial Number:    12345678
Firmware Version: 3.AAD
User Capacity:    320,072,933,376 bytes [320 GB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA/ATAPI-7 (minor revision not indicated)
Local Time is:    Wed Aug 23 17:55:13 2017 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
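
If you only need one field from that information section (to label your notes with the model, say), it can be pulled out with awk. This is just a sketch: info_field is a hypothetical helper, and it assumes the "Key: value" layout shown above.

```shell
# info_field KEY: read "smartctl -i" output on stdin and print the value of
# every line that starts with "KEY:".
info_field() {
    awk -v key="$1:" 'index($0, key) == 1 {
        sub(/^[^:]*:[ \t]*/, "")   # drop the key, the colon and the padding
        print
    }'
}
```

For example: smartctl -i /dev/sdb | info_field "Device Model"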

From this we can see that there is a Seagate disk inside. And since it's a fairly good one, SMART support is available and enabled. To see the "health status" report, we run the following command:

earth:~ # smartctl -H /dev/sdb
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.4.79-18.26-default] (SUSE RPM)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
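
That verdict is easy to test for in a script. A minimal sketch, with health_passed as a made-up name; it only looks for the PASSED line in the text, nothing more. Keep in mind that this is the drive's own assessment, so it is best read together with the self-tests and not on its own.

```shell
# health_passed: read "smartctl -H" output on stdin; succeed only if the
# overall-health line says PASSED.
health_passed() {
    grep -q 'self-assessment test result: PASSED'
}
```

For example: smartctl -H /dev/sdb | health_passed && echo "the drive says it is fine"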

Short and informative, but let's go on and look at the disk's capabilities:

earth:~ # smartctl -c /dev/sdb
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.4.79-18.26-default] (SUSE RPM)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (  430) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 115) minutes.
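
The last lines of that report tell us how long each self-test should take, which is handy if you want a script to wait for the right amount of time. A sketch under the assumption that the output has the two-line layout shown above; polling_minutes is a hypothetical helper.

```shell
# polling_minutes KIND: read "smartctl -c" output on stdin and print the
# recommended polling time in minutes for KIND (Short or Extended).
polling_minutes() {
    awk -v kind="$1" '
        index($0, kind " self-test routine") == 1 { want = 1; next }
        want && /recommended polling time/ {
            gsub(/[^0-9]/, "")   # keep only the digits inside the parentheses
            print
            exit
        }
    '
}
```

For example: smartctl -c /dev/sdb | polling_minutes Extended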

There are two main types of self-test that can be run on the disk, short and extended (long). Let's start with a short one:

earth:~ # smartctl -t short /dev/sdb
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.4.79-18.26-default] (SUSE RPM)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 1 minutes for test to complete.
Test will complete after Wed Aug 23 10:10:10 2017

Use smartctl -X to abort test.

We wait a little over a minute and then:

earth:~ # smartctl -l selftest /dev/sdb
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.4.79-18.26-default] (SUSE RPM)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description  Status                  Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline     Completed without error 00%       2086            -
# 2 Short offline     Completed without error 00%       1414            -
# 3 Short offline     Completed without error 00%       1394            -
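
With a long log it can help to filter out the entries that went fine. A small sketch, assuming the log layout shown above, where every entry starts with "#" and its number; failed_selftests is a made-up name. It prints every entry that neither completed without error nor is still in progress, so aborted or interrupted tests show up as well.

```shell
# failed_selftests: read "smartctl -l selftest" output on stdin and print the
# log entries whose status is something other than a clean completion.
failed_selftests() {
    awk '/^# *[0-9]+ / && !/Completed without error/ && !/in progress/'
}
```

For example: smartctl -l selftest /dev/sdb | failed_selftests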

This disk has been running for 2086 hours. That is very little compared with quite a few disks I have seen working for more than 50000 hours (which is almost six years of non-stop operation). So let's see how it behaves in a long test:
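
To put those hour counts in perspective, a bit of shell arithmetic is enough. A tiny sketch (hours_to_days is my own helper name): 2086 hours comes out as 86 days + 22 hours, and 50000 hours as 2083 days + 8 hours, which is roughly 5.7 years.

```shell
# hours_to_days HOURS: print a power-on hour count as whole days plus hours.
hours_to_days() {
    echo "$(( $1 / 24 )) days + $(( $1 % 24 )) hours"
}
```

For example: hours_to_days 2086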

earth:~ # smartctl -t long /dev/sdb
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.4.79-18.26-default] (SUSE RPM)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 115 minutes for test to complete.
Test will complete after Wed Aug 23 14:14:14 2017

Use smartctl -X to abort test.
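
While the test runs, smartctl -c shows its progress in the "Self-test execution status" field. As far as I understand the ATA SMART convention, the number in parentheses packs two things into one byte: the upper four bits are a result code (0 for a clean or never-run test, 15 for a test in progress, other values for the different kinds of failure) and the lower four bits are the part of the test still remaining, in tenths. Take the sketch below as an illustration of that assumption, not as a reference; decode_selftest_status is a hypothetical helper. With that reading, a value of 249 would mean "in progress, 90% remaining".

```shell
# decode_selftest_status VALUE: split the self-test execution status byte
# into its result code (upper 4 bits) and the remaining percentage
# (lower 4 bits, in steps of 10%). Assumes the ATA SMART encoding.
decode_selftest_status() {
    echo "result_code=$(( $1 >> 4 )) remaining=$(( ($1 & 15) * 10 ))%"
}
```

For example: decode_selftest_status 249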

This is going to take a good while (almost two hours), so in the meantime I'm going to unpack and assemble the R3 (I cover that in the next section), and then I'll be back. Once the time is up, another good command for asking the disk about its state is the e(x)tended (a)ll report, --xall. We can also specify the device type, which is SAT.

earth:~ # smartctl --device=sat --xall /dev/sdb
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.4.79-18.26-default] (SUSE RPM)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.10
Device Model:     ST3320820A
Serial Number:    12345678
Firmware Version: 3.AAD
User Capacity:    320,072,933,376 bytes [320 GB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA/ATAPI-7 (minor revision not indicated)
Local Time is:    Fri Aug 25 15:15:15 2017 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM feature is:   Unavailable
Rd look-ahead is: Enabled
Write cache is:   Enabled
ATA Security is:  Disabled, NOT FROZEN [SEC1]
Wt Cache Reorder: Unavailable

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      ( 121) The previous self-test completed having
                                        the read element of the test failed.
Total time to complete Offline
data collection:                (  430) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 115) minutes.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR--   101   093   006    -    89908515
  3 Spin_Up_Time            PO----   094   090   000    -    0
  4 Start_Stop_Count        -O--CK   100   100   020    -    867
  5 Reallocated_Sector_Ct   PO--CK   100   100   036    -    0
  7 Seek_Error_Rate         POSR--   081   060   030    -    132019117
  9 Power_On_Hours          -O--CK   098   098   000    -    2113
 10 Spin_Retry_Count        PO--C-   100   100   097    -    0
 12 Power_Cycle_Count       -O--CK   099   099   020    -    1470
187 Reported_Uncorrect      -O--CK   092   092   000    -    8
189 High_Fly_Writes         -O-RCK   100   100   000    -    0
190 Airflow_Temperature_Cel -O---K   054   050   045    -    46 (Min/Max 27/48)
194 Temperature_Celsius     -O---K   046   050   000    -    46 (0 17 0 0 0)
195 Hardware_ECC_Recovered  -O-RC-   059   052   000    -    58061254
197 Current_Pending_Sector  -O--C-   100   100   000    -    1
198 Offline_Uncorrectable   ----C-   100   100   000    -    1
199 UDMA_CRC_Error_Count    -OSRCK   200   200   000    -    3
200 Multi_Zone_Error_Rate   ------   100   253   000    -    0
202 Data_Address_Mark_Errs  -O--CK   100   253   000    -    0
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

General Purpose Log Directory Version 1
SMART           Log Directory Version 1 [multi-sector log support]
Address    Access  R/W   Size  Description
0x00       GPL,SL  R/O      1  Log Directory
0x01       GPL,SL  R/O      1  Summary SMART error log
0x02       GPL,SL  R/O      5  Comprehensive SMART error log
0x03       GPL,SL  R/O      5  Ext. Comprehensive SMART error log
0x06       GPL,SL  R/O      1  SMART self-test log
0x07       GPL,SL  R/O      1  Extended self-test log
0x09       GPL,SL  R/W      1  Selective self-test log
0x20       GPL,SL  R/O      1  Streaming performance log [OBS-8]
0x21       GPL,SL  R/O      1  Write stream error log
0x22       GPL,SL  R/O      1  Read stream error log
0x23       GPL,SL  R/O      1  Delayed sector log [OBS-8]
0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log
0xa0       GPL,SL  VS       1  Device vendor specific log
0xa1       GPL,SL  VS      20  Device vendor specific log
0xa2       GPL,SL  VS     101  Device vendor specific log
0xa8       GPL,SL  VS      20  Device vendor specific log
0xa9       GPL,SL  VS       1  Device vendor specific log
0xe0       GPL,SL  R/W      1  SCT Command/Status
0xe1       GPL,SL  R/W      1  SCT Data Transfer
0xff       GPL     -    23552  Reserved

SMART Extended Comprehensive Error Log Version: 1 (5 sectors)
Device Error Count: 8
        CR     = Command Register
        FEATR  = Features Register
        COUNT  = Count (was: Sector Count) Register
        LBA_48 = Upper bytes of LBA High/Mid/Low Registers ]  ATA-8
        LH     = LBA High (was: Cylinder High) Register    ]   LBA
        LM     = LBA Mid (was: Cylinder Low) Register      ] Register
        LL     = LBA Low (was: Sector Number) Register     ]
        DV     = Device (was: Device/Head) Register
        DC     = Device Control Register
        ER     = Error register
        ST     = Status register
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 8 [7] occurred at disk power-on lifetime: 2103 hours (87 days + 15 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 51 00 00 00 1f 58 33 d8 33 e0 00  Error: UNC at LBA = 0x1f5833d833 = 134623778867

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  25 00 da 00 08 00 1f 58 00 d8 30 e0 00     02:59:21.577  READ DMA EXT
  25 00 da 00 08 00 1f 58 00 d8 30 e0 00     02:59:21.575  READ DMA EXT
  25 00 da 00 08 00 1f 58 00 d8 28 e0 00     02:59:21.571  READ DMA EXT
  25 00 da 00 10 00 1f 59 00 d8 98 e0 00     02:59:21.568  READ DMA EXT
  25 00 da 00 f0 00 1f 58 00 d8 a8 e0 00     02:59:21.566  READ DMA EXT

Error 7 [6] ...
Error 6 [5] ...
...
Error 1 [0] occurred at disk power-on lifetime: 2103 hours (87 days + 15 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 51 00 00 00 1f 58 33 d8 33 e0 00  Error: UNC at LBA = 0x1f5833d833 = 134623778867

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  25 00 da 00 f0 00 1f 58 00 d8 00 e0 00     02:59:21.577  READ DMA EXT
  25 00 da 00 20 00 1f 57 00 d8 e0 e0 00     02:59:21.575  READ DMA EXT
  25 00 da 00 f0 00 1f 56 00 d8 f0 e0 00     02:59:21.571  READ DMA EXT
  25 00 da 00 f0 00 1f 56 00 d8 00 e0 00     02:59:21.568  READ DMA EXT
  25 00 da 00 20 00 1f 55 00 d8 e0 e0 00     02:59:21.566  READ DMA EXT

SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%      2109         29464009922611
# 2  Extended offline    Completed: read failure       90%      2086         27616703817874
# 3  Short offline       Completed without error       00%      2086         -
# 4  Short offline       Completed without error       00%      1414         -
# 5  Short offline       Completed without error       00%      1394         -

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

SCT Commands not supported

Device Statistics (GP/SMART Log 0x04) not supported

SATA Phy Event Counters (GP Log 0x11) not supported

Well, this is not very good news, because something went wrong during the test: the extended self-test ended with a read failure, and the drive now reports one pending sector and eight uncorrectable errors. That means I'll have to loosen the purse strings and buy a new hard drive, because I intend this project to be something stable. But in the meantime, let's get some use out of this one.

As soon as I get the next hard drive, I'll update this blog entry :)
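
In the attribute table, the counters I would look at first are the ones about sectors and uncorrectable errors: IDs 5, 187, 197 and 198. A minimal sketch that prints those attributes when their raw value is above zero. Assumptions: worrying_attributes is a made-up name, the choice of IDs is my own rule of thumb, and the raw value is read from the eighth column, as in the brief table that --xall prints (the default smartctl -A table has more columns, so the column number would need adjusting there).

```shell
# worrying_attributes: read "smartctl --xall" output on stdin and print
# "NAME RAW_VALUE" for attributes 5, 187, 197 and 198 with a non-zero raw value.
# Assumes the brief attribute layout, where RAW_VALUE is the 8th field.
worrying_attributes() {
    awk 'NF >= 8 && $1 ~ /^(5|187|197|198)$/ && $8 + 0 > 0 { print $2, $8 }'
}
```

For example: smartctl --device=sat --xall /dev/sdb | worrying_attributes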

