# NAS.SP00 SMART audit

**Date:** 2026-05-21  
**Host:** PVENAS (Proxmox VE), `10.0.10.10`  
**Pool:** ZFS `NAS.SP00`  
**Related:** [nas-sp00-drive-failure-report.md](nas-sp00-drive-failure-report.md)

---

## Executive summary

| Serial | Device | Capacity | ZFS (mirror) | SMART health |
|--------|--------|----------|--------------|--------------|
| W4J0L0BA | sda | 5.00 TB | mirror-0 ONLINE | **PASSED** |
| W4J0L3PY | sdb | **137 GB** (reported; expected 5.00 TB) | mirror-0 UNAVAIL | **UNKNOWN** (read fails) |
| W4J0K9V7 | sdc | 5.00 TB | mirror-1 ONLINE | **PASSED** |
| W4J0LKCD | sdd | 5.00 TB | mirror-1 ONLINE | **PASSED** |

Pool state at audit time: **DEGRADED**. The failed leg is `W4J0L3PY` (`/dev/sdb`) in mirror-0. ZFS reports no known data errors, and the three healthy drives show no reallocated, pending, or uncorrectable sectors.

---

## ZFS pool status

```
  pool: NAS.SP00
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid. Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
  scan: resilvered 0B in 00:00:01 with 0 errors on Thu May 21 21:27:54 2026

        NAME                                 STATE     READ WRITE CKSUM
        NAS.SP00                             DEGRADED     0     0     0
          mirror-0                           DEGRADED     0     0     0
            ata-ST5000DM000-1FK178_W4J0L0BA  ONLINE       0     0     0
            11449632222283419591             UNAVAIL      0     0     0  was /dev/disk/by-id/ata-ST5000DM000-1FK178_W4J0L3PY-part1
          mirror-1                           ONLINE       0     0     0
            ata-ST5000DM000-1FK178_W4J0LKCD  ONLINE       0     0     0
            ata-ST5000DM000-1FK178_W4J0K9V7  ONLINE       0     0     0

errors: No known data errors
```
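
For repeat audits, the rows that are not `ONLINE` can be pulled out of this output mechanically. The following is a minimal sketch, assuming the usual `NAME STATE READ WRITE CKSUM` table layout; the function name `unhealthy_vdevs` is hypothetical and not part of the original audit.

```bash
#!/usr/bin/env bash
# Sketch: print "name state" for every config row that is not ONLINE.
# Reads `zpool status` text on stdin. Assumes the config table starts at
# the NAME/STATE header row and ends at the "errors:" line.
# Note: hot spares (state AVAIL) would also be printed by this filter.
unhealthy_vdevs() {
  awk '
    $1 == "NAME" && $2 == "STATE" { in_cfg = 1; next }  # table header
    $1 == "errors:"               { in_cfg = 0 }        # end of table
    in_cfg && NF >= 2 && $2 != "ONLINE" { print $1, $2 }
  '
}

# Usage on the host (hypothetical invocation):
#   zpool status NAS.SP00 | unhealthy_vdevs
```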

---

## Block devices (`lsblk`)

| NAME | SIZE | MODEL | SERIAL | ROTA |
|------|------|-------|--------|------|
| sda | 4.5T | ST5000DM000-1FK178 | W4J0L0BA | 1 |
| sdb | 3.9G | ST5000DM000 | W4J0L3PY | 1 |
| sdc | 4.5T | ST5000DM000-1FK178 | W4J0K9V7 | 1 |
| sdd | 4.5T | ST5000DM000-1FK178 | W4J0LKCD | 1 |

---

## Healthy drives — key metrics

| Metric | sda (W4J0L0BA) | sdc (W4J0K9V7) | sdd (W4J0LKCD) |
|--------|----------------|----------------|----------------|
| Model | ST5000DM000-1FK178 | ST5000DM000-1FK178 | ST5000DM000-1FK178 |
| Firmware | CC48 | CC48 | CC48 |
| WWN | 5000c500082c02f61 | 5000c500082c7e2ce | 5000c500082d84c45 |
| Rotation | 5980 rpm | 5980 rpm | 5980 rpm |
| SATA | 3.1 @ 6.0 Gb/s | 3.1 @ 6.0 Gb/s | 3.1 @ 6.0 Gb/s |
| Power-on hours | 52,481 (~6.0 y) | 53,087 (~6.1 y) | 45,580 (~5.2 y) |
| Temperature | 27 °C | 30 °C | 30 °C |
| Reallocated sectors | 0 | 0 | 0 |
| Current pending sectors | 0 | 0 | 0 |
| Offline uncorrectable | 0 | 0 | 0 |
| UDMA CRC errors | 0 | 0 | 0 |
| Start/stop count | 350 | 367 | 310 |
| Load cycle count | 348,974 | 340,961 | 184,891 |
| Power cycle count | 345 | 363 | 309 |
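
The approximate ages in the power-on hours row follow from dividing the hour count by the number of hours in a year. A minimal sketch, assuming an average year of 8,766 hours (365.25 days); the function name `poh_to_years` is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: convert a Power_On_Hours raw value to years, one decimal place.
# Assumption: 8766 h per year (365.25 days x 24 h).
# Pass the plain integer, without thousands separators.
poh_to_years() {
  awk -v h="$1" 'BEGIN { printf "%.1f\n", h / 8766 }'
}

poh_to_years 52481   # sda, prints 6.0
poh_to_years 53087   # sdc, prints 6.1
poh_to_years 45580   # sdd, prints 5.2
```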

A high **Load_Cycle_Count** is common on Seagate Desktop HDD.15 drives because of frequent head parking. It is not alarming while the reallocated and pending sector counts remain zero.
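
The zero-count check on reallocated, pending, and uncorrectable sectors can be scripted for future audits. The sketch below assumes the default ten-column `smartctl -A` table, in which the raw value is the tenth field; the trimmed listings in this report omit some of those columns, so it would need raw `smartctl` output as input. The function name `nonzero_critical_attrs` is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: flag non-zero raw values for attributes 5, 197 and 198.
# Reads `smartctl -A` output on stdin. Assumes the default column layout:
# ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
# Prints "Name=raw" for each offender and returns 1 if any were found.
nonzero_critical_attrs() {
  awk '
    $1 ~ /^(5|197|198)$/ && ($10 + 0) != 0 { print $2 "=" $10; bad = 1 }
    END { exit (bad ? 1 : 0) }
  '
}

# Usage on the host (hypothetical invocation):
#   smartctl -A /dev/sda | nonzero_critical_attrs || echo "sda needs attention"
```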

---

## Failed drive — `/dev/sdb` (W4J0L3PY)

### Identity

| Field | Value |
|-------|-------|
| Device Model | ST5000DM000 (truncated; the `-1FK178` suffix is missing) |
| Serial | W4J0L3PY |
| WWN | 5000c500082cc8bbb |
| Firmware | CC48 |
| User capacity | 137,438,952,960 bytes [**137 GB**] |
| Expected capacity | 5,000,981,078,016 bytes [5.00 TB] |
| Rotation | 7200 rpm (reported) |
| SATA | 3.0, 6.0 Gb/s |

### SMART

```
Read SMART Data failed: scsi error aborted command
SMART Status command failed: scsi error aborted command
SMART overall-health self-assessment test result: UNKNOWN!
SMART Status, Attributes and Thresholds cannot be read.
```

**Action:** Replace the drive; see [nas-sp00-drive-failure-report.md](nas-sp00-drive-failure-report.md).
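
One detail worth noting about the reported capacity: 137,438,952,960 bytes is exactly (2^28 - 1) sectors of 512 bytes, the largest sector count a 28-bit LBA field can express. That may indicate the drive is no longer reporting its real geometry, though this is an interpretation of the number and not something the audit output confirms. The arithmetic can be checked with a short sketch (the function name `lba28_max_bytes` is hypothetical):

```bash
#!/usr/bin/env bash
# Sketch: largest capacity expressible as a 28-bit sector count,
# assuming 512-byte logical sectors.
lba28_max_bytes() {
  echo $(( (2**28 - 1) * 512 ))
}

reported=137438952960   # User capacity of /dev/sdb from the identity table
if [ "$reported" -eq "$(lba28_max_bytes)" ]; then
  echo "reported capacity equals the 28-bit LBA ceiling"
fi
```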

---

## Full SMART attributes (healthy drives)

### `/dev/sda` — W4J0L0BA (mirror-0, ONLINE)

```
SMART overall-health self-assessment test result: PASSED

ID# ATTRIBUTE_NAME          VALUE WORST THRESH TYPE      RAW_VALUE
  1 Raw_Read_Error_Rate     119   100   006    Pre-fail  211189952
  3 Spin_Up_Time            092   091   000    Pre-fail  0
  4 Start_Stop_Count        100   100   020    Old_age   350
  5 Reallocated_Sector_Ct   100   100   010    Pre-fail  0
  7 Seek_Error_Rate         080   060   030    Pre-fail  43979429424
  9 Power_On_Hours          041   041   000    Old_age   52481
 10 Spin_Retry_Count        100   100   097    Pre-fail  0
 12 Power_Cycle_Count       100   100   020    Old_age   345
183 Runtime_Bad_Block       100   100   000    Old_age   0
184 End-to-End_Error        100   100   099    Old_age   0
187 Reported_Uncorrect      100   100   000    Old_age   0
188 Command_Timeout         100   099   000    Old_age   3 3 3
189 High_Fly_Writes         100   100   000    Old_age   0
190 Airflow_Temperature_Cel 073   058   045    Old_age   27 (Min/Max 27/28)
191 G-Sense_Error_Rate      100   100   000    Old_age   0
192 Power-Off_Retract_Count 100   100   000    Old_age   0
193 Load_Cycle_Count        001   001   000    Old_age   348974
194 Temperature_Celsius     027   042   000    Old_age   27
195 Hardware_ECC_Recovered  119   100   000    Old_age   211189952
197 Current_Pending_Sector  100   100   000    Old_age   0
198 Offline_Uncorrectable   100   100   000    Old_age   0
199 UDMA_CRC_Error_Count    200   200   000    Old_age   0
240 Head_Flying_Hours       100   253   000    Old_age   15140h+51m+12.276s
241 Total_LBAs_Written      100   253   000    Old_age   57665101118
242 Total_LBAs_Read         100   253   000    Old_age   160962549062
```

### `/dev/sdc` — W4J0K9V7 (mirror-1, ONLINE)

```
SMART overall-health self-assessment test result: PASSED

ID# ATTRIBUTE_NAME          VALUE WORST THRESH TYPE      RAW_VALUE
  1 Raw_Read_Error_Rate     117   100   006    Pre-fail  136042192
  3 Spin_Up_Time            092   091   000    Pre-fail  0
  4 Start_Stop_Count        100   100   020    Old_age   367
  5 Reallocated_Sector_Ct   100   100   010    Pre-fail  0
  7 Seek_Error_Rate         083   060   030    Pre-fail  22512744055
  9 Power_On_Hours          040   040   000    Old_age   53087
 10 Spin_Retry_Count        100   100   097    Pre-fail  0
 12 Power_Cycle_Count       100   100   020    Old_age   363
183 Runtime_Bad_Block       100   100   000    Old_age   0
184 End-to-End_Error        100   100   099    Old_age   0
187 Reported_Uncorrect      100   100   000    Old_age   0
188 Command_Timeout         100   099   000    Old_age   6 6 12
189 High_Fly_Writes         096   096   000    Old_age   4
190 Airflow_Temperature_Cel 070   060   045    Old_age   30 (Min/Max 28/30)
191 G-Sense_Error_Rate      100   100   000    Old_age   0
192 Power-Off_Retract_Count 100   100   000    Old_age   0
193 Load_Cycle_Count        001   001   000    Old_age   340961
194 Temperature_Celsius     030   040   000    Old_age   30
195 Hardware_ECC_Recovered  117   100   000    Old_age   136042192
197 Current_Pending_Sector  100   100   000    Old_age   0
198 Offline_Uncorrectable   100   100   000    Old_age   0
199 UDMA_CRC_Error_Count    200   200   000    Old_age   0
240 Head_Flying_Hours       100   253   000    Old_age   15859h+53m+20.869s
241 Total_LBAs_Written      100   253   000    Old_age   57609506493
242 Total_LBAs_Read         100   253   000    Old_age   152392393081
```

### `/dev/sdd` — W4J0LKCD (mirror-1, ONLINE)

```
SMART overall-health self-assessment test result: PASSED

ID# ATTRIBUTE_NAME          VALUE WORST THRESH TYPE      RAW_VALUE
  1 Raw_Read_Error_Rate     116   090   006    Pre-fail  108217848
  3 Spin_Up_Time            092   091   000    Pre-fail  0
  4 Start_Stop_Count        100   100   020    Old_age   310
  5 Reallocated_Sector_Ct   100   100   010    Pre-fail  0
  7 Seek_Error_Rate         073   051   030    Pre-fail  185584998742
  9 Power_On_Hours          048   048   000    Old_age   45580
 10 Spin_Retry_Count        100   100   097    Pre-fail  0
 12 Power_Cycle_Count       100   100   020    Old_age   309
183 Runtime_Bad_Block       100   100   000    Old_age   0
184 End-to-End_Error        100   100   099    Old_age   0
187 Reported_Uncorrect      100   100   000    Old_age   0
188 Command_Timeout         100   099   000    Old_age   8 8 14
189 High_Fly_Writes         098   098   000    Old_age   2
190 Airflow_Temperature_Cel 070   050   045    Old_age   30 (Min/Max 29/30)
191 G-Sense_Error_Rate      100   100   000    Old_age   0
192 Power-Off_Retract_Count 100   100   000    Old_age   0
193 Load_Cycle_Count        008   008   000    Old_age   184891
194 Temperature_Celsius     030   050   000    Old_age   30
195 Hardware_ECC_Recovered  116   100   000    Old_age   108217848
197 Current_Pending_Sector  100   091   000    Old_age   0
198 Offline_Uncorrectable   100   091   000    Old_age   0
199 UDMA_CRC_Error_Count    200   200   000    Old_age   0
240 Head_Flying_Hours       100   253   000    Old_age   11604h+15m+50.842s
241 Total_LBAs_Written      100   253   000    Old_age   72962800596
242 Total_LBAs_Read         100   253   000    Old_age   167268621195
```

---

## How this audit was collected
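
On PVENAS as root:

```bash
# Pool topology and per-device state
zpool status NAS.SP00
# Whole-disk view only (-d), with model and serial for cross-referencing
lsblk -d -o NAME,SIZE,MODEL,SERIAL,ROTA,STATE /dev/sd{a,b,c,d}
# Identity (-i), health verdict (-H), and attribute table (-A) per drive
for d in sda sdb sdc sdd; do smartctl -i -H -A "/dev/$d"; done
```

Audit timestamp (host local): Thu May 21 22:13:58 2026 EDT.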
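
`smartctl` returns its result as a bit mask in the exit status, which the loop above discards. If the audit is scripted, decoding that status makes a failing drive such as `/dev/sdb` stand out without reading the full output. A minimal sketch, with bit meanings paraphrased from the `smartctl` man page; the function name `decode_smartctl_status` is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: print one line per bit set in a smartctl exit status.
# Bit meanings are paraphrased from the smartctl(8) "EXIT STATUS" section.
decode_smartctl_status() {
  local status=$1 bit
  local -a meaning=(
    "command line did not parse"
    "device open failed"
    "SMART or ATA command failed, or checksum error"
    "SMART status reports DISK FAILING"
    "prefail attribute at or below threshold"
    "attribute at or below threshold in the past"
    "device error log contains errors"
    "self-test log contains errors"
  )
  for bit in 0 1 2 3 4 5 6 7; do
    if (( status & (1 << bit) )); then
      echo "bit $bit: ${meaning[bit]}"
    fi
  done
}

# Usage on the host (hypothetical invocation):
#   smartctl -i -H -A /dev/sdb; decode_smartctl_status $?
```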

---

## Next steps
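
1. Replace **W4J0L3PY** with a NAS-class HDD of at least 5 TB (match ST5000DM000-1FK178 or better), so that it is no smaller than its mirror partner.
2. Run `zpool replace NAS.SP00 <old-device> <new-device>`, where the old device is the GUID `11449632222283419591` shown in the pool status and the new device is the replacement disk's `/dev/disk/by-id/` path.
3. Monitor the resilver with `zpool status NAS.SP00`; run `zpool scrub NAS.SP00` once the pool is **ONLINE**.
4. Re-run this SMART audit after the replacement to establish a clean baseline.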
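
The command sequence for steps 2 and 3 can be previewed before touching the pool. The sketch below only prints the commands and does not run them; `print_replace_plan` is a hypothetical helper, and `NEW-DISK-ID` is a placeholder for the replacement disk's real by-id name, which is not known at audit time.

```bash
#!/usr/bin/env bash
# Sketch: print (not execute) the replacement commands for review.
# Arguments: pool name, old device (GUID or path), new device path.
print_replace_plan() {
  local pool=$1 old=$2 new=$3
  printf 'zpool replace %s %s %s\n' "$pool" "$old" "$new"
  printf 'zpool status %s\n' "$pool"   # repeat until the resilver completes
  printf 'zpool scrub %s\n' "$pool"    # only after the pool is ONLINE
}

# NEW-DISK-ID is a placeholder, not a real device name.
print_replace_plan NAS.SP00 11449632222283419591 /dev/disk/by-id/NEW-DISK-ID
```

Using the GUID for the old device avoids depending on `/dev/sdb`, which may be reassigned once the failed drive is pulled.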