[arch-general] [OT] Disk showing too many bad sectors - is it going to fail?
I have a Seagate Barracuda 7200.9 family hard disk of 160 GB capacity which is more than 3 years old. Recently I installed gsmartcontrol, and after running it, it showed the current pending sector count and the offline uncorrectable sector count as 4294967295. Fearing the disk was going bad, I backed up my data and started monitoring it. This is the latest that gsmartcontrol shows: http://img820.imageshack.us/img820/5707/201010161300051280x1024.png
After two weeks, the number of bad sectors is the same and the disk shows no sign of failure (at least there is no problem reading or writing files from or to the disk). Hence I am confused. Is the disk going bad? If it is fine, why can't it remap all those bad sectors?
On 10/16/2010 08:46 AM, Partha Chowdhury wrote:
I have a Seagate Barracuda 7200.9 family hard disk of 160 GB capacity which is more than 3 years old. Recently I installed gsmartcontrol, and after running it, it showed the current pending sector count and the offline uncorrectable sector count as 4294967295. Fearing the disk was going bad, I backed up my data and started monitoring it. This is the latest that gsmartcontrol shows: http://img820.imageshack.us/img820/5707/201010161300051280x1024.png
After two weeks, the number of bad sectors is the same and the disk shows no sign of failure (at least there is no problem reading or writing files from or to the disk). Hence I am confused. Is the disk going bad? If it is fine, why can't it remap all those bad sectors?
That looks a little odd; however, the drive does not seem to report that those attributes are failing (the value is above the threshold). If I'm not mistaken, some drives report strange information in some of the SMART fields, and that might be your case.
I have an HD that once reported a non-zero pending sector count and offline uncorrectable sectors. In my case I believe the sectors were not damaged but just contained erroneous data for some reason: writing over them solved the problem, the reallocated sector count didn't increase while doing so, and the pending sector count and offline uncorrectable sectors went back to zero afterwards.
If you can, try overwriting the whole disk with dd, or test it with badblocks using the write and read-back test, and see if anything changes. The not-so-bad case is that you caught an impending disk failure before it caused trouble and you already have a backup; the best case is that you find out those values are bogus and should not be taken into account.
-- Mauro Santos
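For what it's worth, 4294967295 is exactly 2^32 - 1, i.e. a 32-bit field with every bit set, which fits the guess above that the drive is reporting a placeholder rather than a real sector count. Below is a minimal Python sketch of that reasoning, not a definitive check. The function name `assess_attribute` and the example numbers passed to it are assumptions made up for illustration (they are not read from the screenshot); the rule that an attribute counts as failed when its normalized value is at or below its threshold is the one described in the smartctl man page.

```python
# Sketch only: names and example numbers are illustrative assumptions.

ALL_ONES_32 = 0xFFFFFFFF  # 4294967295, a 32-bit field with every bit set


def assess_attribute(value, threshold, raw):
    """Classify one SMART attribute from its normalized value, threshold and raw count."""
    if value <= threshold:
        # Normalized value at or below the threshold: the drive itself says "failed".
        return "failing"
    if raw == ALL_ONES_32:
        # An all-ones raw count is more plausibly a placeholder than a real count.
        return "raw value suspicious"
    if raw > 0:
        # A real, non-zero count that the drive does not (yet) consider a failure.
        return "watch"
    return "ok"


# Hypothetical attribute: healthy normalized value, all-ones raw count.
print(assess_attribute(value=100, threshold=0, raw=4294967295))
```

A result of "raw value suspicious" is only a hint that the raw number should not be read literally; it does not prove the disk is healthy, so the write and read-back tests suggested above are still worth running.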
On 10/16/2010 02:46 AM, Partha Chowdhury wrote:
I have a Seagate Barracuda 7200.9 family hard disk of 160 GB capacity which is more than 3 years old. Recently I installed gsmartcontrol, and after running it, it showed the current pending sector count and the offline uncorrectable sector count as 4294967295. Fearing the disk was going bad, I backed up my data and started monitoring it. This is the latest that gsmartcontrol shows: http://img820.imageshack.us/img820/5707/201010161300051280x1024.png
After two weeks, the number of bad sectors is the same and the disk shows no sign of failure (at least there is no problem reading or writing files from or to the disk). Hence I am confused. Is the disk going bad? If it is fine, why can't it remap all those bad sectors?
I have found the SMART reporting of Seagate Barracuda drives to be flaky at best (but I always back up and monitor as well). I have 4 spinning in 2 dmraid arrays on a backup server, and sometimes it just throws errors (it has done that for years). Drive info is:
=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.11 family
Device Model:     ST3750330AS
Serial Number:    5QK0Q09G
Firmware Version: SD1A
User Capacity:    750,156,374,016 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 4
Local Time is:    Sat Oct 16 11:59:51 2010 CDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
There have been a number of firmware changes/updates for Seagate drives over the past 3 years and several "bad runs" of disks. Check the Seagate support site and make sure you have the latest firmware for your drive. I have had the bad sector errors: sometimes a true failure, sometimes not. Just back up, monitor, and if you continue to get the errors, drop $50 on a new 1 TB drive.
--
David C. Rankin, J.D.,P.E.
Rankin Law Firm, PLLC
510 Ochiltree Street
Nacogdoches, Texas 75961
Telephone: (936) 715-9333
Facsimile: (936) 715-9339
www.rankinlawfirm.com
On Sat, Oct 16, 2010 at 01:48:06PM +0100, Mauro Santos wrote:
If you can, try overwriting the whole disk with dd, or test it with badblocks using the write and read-back test, and see if anything changes. The not-so-bad case is that you caught an impending disk failure before it caused trouble and you already have a backup; the best case is that you find out those values are bogus and should not be taken into account.
I overwrote the whole disk with ddrescue -f /dev/zero /dev/sdb. After about one and a half hours it stopped with the message "no space left on device". I guess that indicates no problem? I also tried the badblocks program with the -w option. It took a long time (5+ hours) but did not report a bad sector.
On Sat, Oct 16, 2010 at 12:17:50PM -0500, David C. Rankin wrote:
There have been a number of firmware changes/updates for Seagate drives over the past 3 years and several "bad runs" of disks. Check the Seagate support site and make sure you have the latest firmware for your drive. I have had the bad sector errors: sometimes a true failure, sometimes not. Just back up, monitor, and if you continue to get the errors, drop $50 on a new 1 TB drive.
I checked the Seagate site and there is no firmware upgrade for this model. On further googling, I found that Seagate only offers firmware upgrades from the 7200.12 model onwards. Now, to be absolutely sure, I downloaded the SeaTools program and ran a short and a long test, both of which said PASSED. In spite of all this, gsmartcontrol shows the same values. What are the indications that a disk is going bad which a normal user can catch with bare eyes and ears?
2010/10/17 Partha Chowdhury <partha@gmx.us>:
On Sat, Oct 16, 2010 at 01:48:06PM +0100, Mauro Santos wrote:
What are the indications that a disk is going bad which a normal user can catch with bare eyes and ears?
Drive clicking, also called the click of death, is usually a very strong indicator: a loud *click* heard in repetition, meaning the drive is constantly trying to read a sector that is somehow not readable. Once this starts happening, replacing the drive is wise.
On 17-10-2010 15:55, Partha Chowdhury wrote:
I overwrote the whole disk with ddrescue -f /dev/zero /dev/sdb. After about one and a half hours it stopped with the message "no space left on device". I guess that indicates no problem?
I also tried the badblocks program with the -w option. It took a long time (5+ hours) but did not report a bad sector.
It seems to indicate that everything is OK.
I checked the Seagate site and there is no firmware upgrade for this model. On further googling, I found that Seagate only offers firmware upgrades from the 7200.12 model onwards.
Now, to be absolutely sure, I downloaded the SeaTools program and ran a short and a long test, both of which said PASSED.
This reinforces the conclusion drawn from the previous tests. If I'm not wrong, all SeaTools does is issue a SMART long test, which you can also do with smartctl (and then check the test log); this is a read-only test as far as I know. In the SMART test log you can also see the addresses of the pending and offline uncorrectable sectors, if there are not too many of them and if the drive returns sane data (which doesn't seem to be the case here).
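The long test and the log check mentioned above can be driven from smartctl directly. What follows is a hedged Python sketch rather than a recommended tool: it only prints the commands by default (dry run), `/dev/sdb` is simply the device name used earlier in this thread and has to be adapted, and the helper names `selftest_commands` and `run` are made up for this example. The smartctl options used are `-t long` (start the extended self-test), `-l selftest` (print the self-test log) and `-A` (print the attribute table).

```python
import shlex
import subprocess


def selftest_commands(device):
    """Return the smartctl invocations for a long self-test and its follow-up checks."""
    return [
        ["smartctl", "-t", "long", device],      # start the extended (long) self-test
        ["smartctl", "-l", "selftest", device],  # read the self-test log once it has finished
        ["smartctl", "-A", device],              # dump the vendor attribute table
    ]


def run(commands, dry_run=True):
    """Print the commands, or execute them when dry_run is False (needs root)."""
    for cmd in commands:
        if dry_run:
            print(shlex.join(cmd))
        else:
            # smartctl uses a bitmask exit status, so a non-zero code is not
            # necessarily an error; do not raise on it.
            subprocess.run(cmd, check=False)


# Dry run: only shows what would be executed.
run(selftest_commands("/dev/sdb"))
```

The self-test runs in the background on the drive, so the log is only meaningful after the estimated completion time that smartctl prints when the test is started.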
In spite of all this, gsmartcontrol shows the same values.
What are the indications that a disk is going bad which a normal user can catch with bare eyes and ears?
The first indication would be frequent hang-ups or trouble reading some files, together with errors in dmesg about not being able to read sector xyz and some (S)ATA error code. However, like I said before, I've experienced a case where a write to the affected sectors solved the problem; if the problem is serious, then at best the sector will be marked as damaged and reallocated, provided there is still spare space for reallocation.
If you start to see the reallocated sector count (and the reallocation event count on some drives) increase, then better back up everything. Also keep an eye on all pre-failure attributes; if any of those says failing, or failed in the past, I think it's bad news.
You may also hear some abnormal clicking; however, if you hear this, the drive is probably already way past any possibility of data recovery at home.
Mind you that drives sometimes fail without warning or without any change in the SMART attributes. This [1] is an interesting read.
[1] http://labs.google.com/papers/disk_failures.pdf
-- Mauro Santos
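The two attribute checks described above (a growing reallocated sector count, and pre-failure attributes that are failing or have failed in the past) can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not a monitoring tool: the two text snapshots `EARLIER` and `LATER` are invented sample data in the column layout that `smartctl -A` prints, and the function names `parse_attributes` and `find_warnings` are hypothetical.

```python
# Sketch only: EARLIER and LATER are made-up snapshots, not output from any real drive.

REALLOC_NAMES = ("Reallocated_Sector_Ct", "Reallocated_Event_Count")

EARLIER = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
"""

LATER = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       12
 10 Spin_Retry_Count        0x0013   097   097   097    Pre-fail  Always   FAILING_NOW 3
"""


def parse_attributes(text):
    """Parse an attribute table in `smartctl -A` layout into a dict keyed by name."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attrs[fields[1]] = {
            "value": int(fields[3]),
            "thresh": int(fields[5]),
            "prefail": fields[6] == "Pre-fail",
            "when_failed": fields[8],
            "raw": fields[9],
        }
    return attrs


def find_warnings(before, after):
    """Compare two snapshots and list the warning signs named in the text."""
    notes = []
    for name, attr in after.items():
        failed = attr["value"] <= attr["thresh"] or attr["when_failed"] != "-"
        if attr["prefail"] and failed:
            notes.append(f"{name}: pre-failure attribute failing or failed in the past")
    for name in REALLOC_NAMES:
        if name in before and name in after:
            old, new = before[name]["raw"], after[name]["raw"]
            if old.isdigit() and new.isdigit() and int(new) > int(old):
                notes.append(f"{name}: raw count grew from {old} to {new}")
    return notes


for note in find_warnings(parse_attributes(EARLIER), parse_attributes(LATER)):
    print(note)
```

In practice the snapshots would come from saving the `smartctl -A` output at intervals; as noted above, an empty list of warnings does not guarantee a healthy drive.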
participants (4)
- David C. Rankin
- Mauro Santos
- Partha Chowdhury
- Stefan Erik Wilkens