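Personally, I feel Support's reply was too sloppy, as if they were rushing to close the case for some reason.
After returning from the New Year holiday on 2/23, I received a warning email from the NAS: "S.M.A.R.T. test error occurred on disk 1"
The first two screenshots clearly show that SMART detected an error,
but in the third screenshot every status is OK, which is puzzling (this question is probably more for RD).
I wondered: is it a false alarm? Or is there a bug in "SMART Info"?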
<↓Figure 1↓>

<↓Figure 2↓>

<↓Figure 3↓>

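On 2/25 I filed a support ticket on the official website and uploaded the screenshots so that Support could clearly understand the issue. I received the reply today, 2/26, and after reading it I am dissatisfied on the following 4 points:
1. It points to ID 1, but I wish it had also named ID 1 as "Raw_Read_Error_Rate".
2. It says "the value of ID 1 is not 0, which means your disk does have a problem". I wish it had said which value is non-zero (current value, worst value, or ...).
3. Following on from that, if ID 1 is the problem, why is its status OK in Figure 3? No explanation was given.
4. The SMART summary shows an abnormality, yet every status in the SMART info is OK. This was not explained either.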

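I googled a bit myself: a SMART attribute indicates a problem only when its current value (or worst value) falls below the threshold.
In Figure 3, the current and worst values of Raw_Read_Error_Rate are both 200, above the threshold of 51, so the OK status looks reasonable.
But that makes me even less sure whether Support's reply is entirely correct. I would appreciate answers from the experts here. Thanks.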
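
To make the threshold rule above concrete, here is a minimal Python sketch of how I understand the per-attribute status check. The function name `threshold_status` and the status labels are my own, and treating "at or below a non-zero threshold" as failed is an assumption about the usual convention, not something confirmed by Synology.

```python
def threshold_status(value: int, worst: int, thresh: int) -> str:
    """Classify one normalized S.M.A.R.T. attribute against its threshold.

    Assumption: a threshold of 0 can never be crossed, and a value at or
    below a non-zero threshold counts as failed.
    """
    if thresh == 0:
        return "OK"
    if value <= thresh:
        return "FAILING_NOW"         # current normalized value crossed the threshold
    if worst <= thresh:
        return "FAILED_IN_THE_PAST"  # it crossed at some point but has recovered
    return "OK"


# Raw_Read_Error_Rate as shown in Figure 3: value 200, worst 200, threshold 51
print(threshold_status(200, 200, 51))  # prints "OK"
```

Note that the raw value plays no part in this check, which would explain how a non-zero raw value can coexist with an "OK" status.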
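=============Update 2015.02.27 02:19=============
Yesterday evening around 5 or 6 pm I exchanged two emails with support, mainly to provide the debug file for analysis, and after analyzing it they gave their conclusion.
Their reply is quoted below. All my questions have been answered, and the next step is to pull out the faulty WD 4TB RED (WD40EFRX-68WT0N0) and send it in for RMA.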
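In your attachment we found this command:
smartctl -H -d ata /dev/sda
You can try running this command:
smartctl -d ata -a /dev/sda, which returns the information below. Please also note the text in red, which is the cause of the S.M.A.R.T. abnormality.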
=====================================================
smartctl 6.2 (build date Jan 7 2015) [ppc-linux-2.6.32.12] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Device Model: WDC WD40EFRX-68WT0N0
Serial Number: WD-WCC4E7SY4RP1
LU WWN Device Id: 5 0014ee 20b44a248
Firmware Version: 82.00A82
User Capacity: 4,000,787,030,016 bytes [4.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5400 rpm
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ACS-2 (minor revision not indicated)
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is: Thu Feb 26 17:20:14 2015 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 121) The previous self-test completed having
the read element of the test failed.
Total time to complete Offline
data collection: (52560) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 526) minutes.
Conveyance self-test routine
recommended polling time: ( 5) minutes.
SCT capabilities: (0x703d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 3
3 Spin_Up_Time 0x0027 253 176 021 Pre-fail Always - 2941
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 90
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0
9 Power_On_Hours 0x0032 098 098 000 Old_age Always - 1827
10 Spin_Retry_Count 0x0032 100 253 000 Old_age Always - 0
11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 64
192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 5
193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 368
194 Temperature_Celsius 0x0022 122 119 000 Old_age Always - 30
196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 100 253 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed: read failure 90% 1797 9437194 (this shows the S.M.A.R.T. test failed at 10% progress)
# 2 Short offline Completed: read failure 40% 1759 9437194 (this shows the S.M.A.R.T. test failed at 60% progress)
# 3 Short offline Completed: read failure 40% 1741 9437194 (this shows the S.M.A.R.T. test failed at 60% progress)
# 4 Short offline Completed without error 00% 1573 -
# 5 Short offline Completed without error 00% 1405 -
# 6 Short offline Completed without error 00% 1237 -
# 7 Extended offline Completed without error 00% 1228 -
# 8 Short offline Completed without error 00% 1069 -
# 9 Short offline Completed without error 00% 902 -
#10 Short offline Completed without error 00% 735 -
#11 Short offline Completed without error 00% 573 -
#12 Extended offline Completed without error 00% 492 -
#13 Short offline Completed without error 00% 406 -
#14 Short offline Completed without error 00% 238 -
#15 Short offline Completed without error 00% 73 -
#16 Extended offline Completed without error 00% 31 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
=====================================================
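Each item in your S.M.A.R.T. information is judged against criteria set by the individual vendor. If the vendor's criteria are too lenient, the status may still show "OK".
The abnormality in your UI is actually caused by the red text listed above, which makes the S.M.A.R.T. status show "Abnormal". We recommend replacing this disk.
We hope this information helps with your problem. Thank you.
Synology Technical Support Engineer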
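
For anyone who wants to check their own disk the same way, here is a rough Python sketch of what support did by eye: pick out the failed rows in the self-test log and convert "Remaining" into "how far the test got". The sample rows are copied from the output above with support's notes removed. The regex, the function names, and the assumption that every status starts with "Completed" are mine. The second helper splits the "Self-test execution status" number into a result code and a remaining percentage, based on my understanding that the high 4 bits hold the code and the low 4 bits hold the tenths remaining.

```python
import re

# Rows copied from the self-test log above (support's notes removed).
SELF_TEST_LOG = """\
# 1 Extended offline Completed: read failure 90% 1797 9437194
# 2 Short offline Completed: read failure 40% 1759 9437194
# 4 Short offline Completed without error 00% 1573 -
"""

# Assumption: only rows whose status starts with "Completed" are handled.
ROW = re.compile(
    r"^#\s*(?P<num>\d+)\s+(?P<desc>.+?)\s+(?P<status>Completed.*?)\s+"
    r"(?P<remaining>\d+)%\s+(?P<hours>\d+)\s+(?P<lba>\S+)"
)


def failed_self_tests(log_text):
    """Return one dict per self-test row that did not complete without error."""
    failures = []
    for line in log_text.splitlines():
        m = ROW.match(line)
        if m and "without error" not in m["status"]:
            failures.append({
                "num": int(m["num"]),
                "test": m["desc"],
                "status": m["status"],
                # "Remaining 90%" means the test stopped after 10% of its run
                "progress_pct": 100 - int(m["remaining"]),
                "power_on_hours": int(m["hours"]),
                "first_error_lba": m["lba"],
            })
    return failures


def decode_exec_status(status_byte):
    """Split the status number into (result code, percent remaining)."""
    return status_byte >> 4, (status_byte & 0x0F) * 10


for f in failed_self_tests(SELF_TEST_LOG):
    print(f["num"], f["test"], f"failed at {f['progress_pct']}%", "LBA", f["first_error_lba"])

print(decode_exec_status(121))  # prints (7, 90)
```

With the log above, both failed rows report the same first-error LBA, 9437194. The decoded 90% remaining matches the "90%" column of entry # 1.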