Re: [arch-general] [arch-dev-public] Load_Cycle_Count and storage-fixup
Does anyone else have an opinion about how to handle this?
I'd like to affirm the opinions of Roman and Xavier and take some action on this.
Anyone object to my putting storage-fixup in [extra] at least? If no objections by W 9/21, I plan to go ahead with that step. If it works out, we can talk about follow-up steps like:
1) moving it to [core] 2) integrating it into default rc.d scripts
It's a pretty serious issue for laptop users with affected drives. And the drives are pretty popular ones, methinks.
Also note that, when raising awareness about this issue, the fixup usually needs to be run on resume from disk (and I think resume from RAM) too, not only on boot. -- damjan
On Mon, 2009-10-19 at 02:54 +0200, Damjan Georgievski wrote:
Does anyone else have an opinion about how to handle this?
I'd like to affirm the opinions of Roman and Xavier and take some action on this.
Anyone object to my putting storage-fixup in [extra] at least? If no objections by W 9/21, I plan to go ahead with that step. If it works out, we can talk about follow-up steps like:
1) moving it to [core] 2) integrating it into default rc.d scripts
It's a pretty serious issue for laptop users with affected drives. And the drives are pretty popular ones, methinks.
Also note that, when raising awareness about this issue, the fixup usually needs to be run on resume from disk (and I think resume from RAM) too, not only on boot.
Isn't all this handled simply by getting laptop-mode to do it? It's possible to allow laptop-mode to control hdparm settings, after all. Besides, this is mainly (exclusively?) a problem for laptop hard discs.
laptop-mode by itself won't do it, but laptop-mode-tools will. However, some users (such as I) see laptop-mode-tools as bloat because it comes with all this other stuff for controlling other aspects of power consumption.
2009/10/19 Ng Oon-Ee <ngoonee@gmail.com>:
On Mon, 2009-10-19 at 02:54 +0200, Damjan Georgievski wrote:
Does anyone else have an opinion about how to handle this?
I'd like to affirm the opinions of Roman and Xavier and take some action on this.
Thanks for picking up this topic!
Anyone object to my putting storage-fixup in [extra] at least? If no objections by W 9/21, I plan to go ahead with that step. If it works out, we can talk about follow-up steps like:
1) moving it to [core] 2) integrating it into default rc.d scripts
It's a pretty serious issue for laptop users with affected drives. And the drives are pretty popular ones, methinks.
No objections, it seems, go for it!
Also note that, when raising awareness about this issue, the fixup usually needs to be run on resume from disk (and I think resume from RAM) too, not only on boot.
Good point!
Isn't all this handled simply by getting laptop-mode to do it? It's possible to allow laptop-mode to control hdparm settings, after all.
IIRC laptop-mode-tools can fix this if configured correctly, but this is not the right solution IMO.
Besides, this is mainly (exclusively?) a problem for laptop hard discs.
I'm not sure if this is a (common enough) problem for 3.5" (non-laptop) HDDs, but it's worth noting that 2.5" (AKA "laptop") HDDs are used not only in laptops: Mini-ITX boxes, NAS boxes, HDTV players, even some servers - all have 2.5" HDDs quite often. -- Roman Kyrylych (Роман Кирилич)
On Wed, Oct 21, 2009 at 4:07 PM, Roman Kyrylych <roman.kyrylych@gmail.com> wrote:
2009/10/19 Ng Oon-Ee <ngoonee@gmail.com>:
On Mon, 2009-10-19 at 02:54 +0200, Damjan Georgievski wrote:
Does anyone else have an opinion about how to handle this?
I'd like to affirm the opinions of Roman and Xavier and take some action on this.
Thanks for picking up this topic!
Anyone object to my putting storage-fixup in [extra] at least? If no objections by W 9/21, I plan to go ahead with that step. If it works out, we can talk about follow-up steps like:
1) moving it to [core] 2) integrating it into default rc.d scripts
It's a pretty serious issue for laptop users with affected drives. And the drives are pretty popular ones, methinks.
No objections, it seems, go for it!
Also note that, when raising awareness about this issue, the fixup usually needs to be run on resume from disk (and I think resume from RAM) too, not only on boot.
Good point!
Isn't all this handled simply by getting laptop-mode to do it? It's possible to allow laptop-mode to control hdparm settings, after all.
IIRC laptop-mode-tools can fix this if configured correctly, but this is not the right solution IMO.
Besides, this is mainly (exclusively?) a problem for laptop hard discs.
I'm not sure if this is a (common enough) problem for 3.5" (non-laptop) HDDs, but it's worth noting that 2.5" (AKA "laptop") HDDs are used not only in laptops: Mini-ITX boxes, NAS boxes, HDTV players, even some servers - all have 2.5" HDDs quite often.
What ever happened to this issue? I've been trying to follow it, but got lost with other things. Do we have storage-fixup anywhere? Is there a wiki page on this info?
On Wed, Oct 28, 2009 at 3:04 PM, Aaron Griffin <aaronmgriffin@gmail.com> wrote:
What ever happened to this issue? I've been trying to follow it, but got lost with other things. Do we have storage-fixup anywhere? Is there a wiki page on this info?
Paul has added storage-fixup to extra. I think that nothing else has been done.
When I got a new laptop I investigated this problem a little and found that with hdparm -B 254/255 the temperature went up quite significantly. This may be a freak and I would love to know whether there really is something behind it, but when I used -B 200 the temperature increase was clearly smaller, but the load cycle count did not increase!!! Is this actually at all possible? Does the -B option do something other than only affecting head loading? Does anybody know?
In looking through the storage-fixup package data I see that always -B 254 or -B 255 is set, so obviously there is no sign of other, possibly more optimal values there.
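For reference on the -B question: the hdparm man page groups the values into bands (1 through 127 permit spin-down, 128 through 254 do not, and 255 asks the drive to disable Advanced Power Management altogether). The sketch below only encodes those documented bands; the function name is made up for illustration. How a particular level maps to head unloading is not specified there and appears to be up to the drive firmware, which would be consistent with -B 200 behaving differently from -B 254 on some drives.

```python
def describe_apm_level(level: int) -> str:
    """Describe an `hdparm -B` level using the bands documented in the hdparm man page.

    Hypothetical helper for illustration only; it does not talk to any drive.
    """
    if not 1 <= level <= 255:
        raise ValueError("hdparm -B expects a value from 1 to 255")
    if level == 255:
        # 255 asks the drive to turn APM off entirely (not every drive supports this).
        return "APM disabled"
    if level >= 128:
        # 128..254: power management stays on, but spin-down is not permitted.
        return "APM enabled, spin-down not permitted"
    # 1..127: the most aggressive power saving, spin-down permitted.
    return "APM enabled, spin-down permitted"


if __name__ == "__main__":
    for value in (127, 200, 254, 255):
        print(value, "->", describe_apm_level(value))
```

Whether an intermediate value such as 200 stops excessive head parking on a given model still has to be checked by watching Load_Cycle_Count.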
On Thu, Oct 29, 2009 at 2:08 PM, Michael Towers <larch42@googlemail.com> wrote:
When I got a new laptop I investigated this problem a little and found that with hdparm -B 254/255 the temperature went up quite significantly. This may be a freak and I would love to know whether there really is something behind it, but when I used -B 200 the temperature increase was clearly smaller, but the load cycle count did not increase!!! Is this actually at all possible? Does the -B option do something other than only affecting head loading? Does anybody know?
In looking through the storage-fixup package data I see that always -B 254 or -B 255 is set, so obviously there is no sign of other, possibly more optimal values there.
That sounds like an interesting concern, you might want to ask upstream (= storage-fixup maintainers) about it :)
On Thu, Oct 29, 2009 at 9:31 AM, Xavier <shiningxc@gmail.com> wrote:
That sounds like an interesting concern, you might want to ask upstream (= storage-fixup maintainers) about it :)
Anyone happen to know how often the storage-fixup rules are updated? My Eee drive isn't listed (mine does NOT have an SSD) so I'm not sure what the hdparm params should be.
On Thu, Oct 29, 2009 at 4:26 PM, Aaron Griffin <aaronmgriffin@gmail.com> wrote:
Anyone happen to know how often the storage-fixup rules are updated? My Eee drive isn't listed (mine does NOT have an SSD) so I'm not sure what the hdparm params should be.
There might be a clue from the config file itself:
http://git.kernel.org/?p=linux/kernel/git/tj/storage-fixup.git;a=blob_plain;...
# If you have a harddrive which does crazy unloading but not listed
# here, please write to linux-ide@vger.kernel.org with the outputs of
# "dmidecode" and "hdparm -I DRIVE" attached. On a laptop the DRIVE
# is usually /dev/sda.
2009/10/29 Aaron Griffin <aaronmgriffin@gmail.com>:
Anyone happen to know how often the storage-fixup rules are updated? My Eee drive isn't listed (mine does NOT have an SSD) so I'm not sure what the hdparm params should be.
From what I have read on the topic, there is no static answer to "what it should be". How often a drive should cycle per unit of time depends on temperature and actual drive use. If it's sitting still (the laptop is motionless) in a cool area with nothing at all to do, the drive does not have to cycle.
However, if the laptop is in motion (train, car) and the system is hot, cycling the drive can reduce heat output and prevent damage to the drive due to sudden movement. This is why simply setting things to -B 255 or 254 (which disables the feature completely) is not something that should be done without at least informing the user.
Through experimentation, I suppose you can find a few values to work with. From the quick glance I took at storage-fixup, it seems to disable the feature completely. Does anybody know if it's more advanced than this or is this the full scope of this script?
-- msn: stefan_wilkens@hotmail.com e-mail: stefanwilkens@gmail.com blog: http://www.stefanwilkens.eu/ address: Lipperkerkstraat 14 7511 DA Enschede
On Thu, Oct 29, 2009 at 6:58 PM, Stefan Erik Wilkens <stefanwilkens@gmail.com> wrote:
Through experimentation, I suppose you can find a few values to work with. From the quick glance I took at storage-fixup, it seems to disable the feature completely. Does anybody know if it's more advanced than this or is this the full scope of this script?
Did you look at the config file? http://git.kernel.org/?p=linux/kernel/git/tj/storage-fixup.git;a=blob_plain;...
It contains information about known bad disks and the command to execute for each disk. The script just parses the config file, checks whether your disk matches one from the config file, and executes the command from the config.
Given the difficulty of finding the optimal solution to this problem, I think I agree with the earlier suggestion to just monitor the situation and report to the user if there is a problem - and provide a useful account of how to handle it. I imagine it would not be too difficult to write cron scripts to monitor the count. Perhaps one could measure load cycles over the last hour, the last day and the last month, for example. There could be some sort of notification if some threshold or other was overstepped. At this point it is of course less straightforward - what sort of notification should that be? A 'normal' desktop notification would perhaps work for most users?
Normally I wouldn't suggest something like this here for fear of getting my knuckles rapped - at present I really don't have time to do it myself - but maybe someone is just itching to get going on a little project like this.
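A rough sketch of the monitoring idea, in Python: read the Load_Cycle_Count raw value from the attribute table printed by `smartctl -A`, then compare two samples to get a rate. The function names, the sample table, and the threshold are all made up for illustration, and the parser assumes the usual ten-column attribute layout with the raw value in the last column; a real cron job would also need to store the previous sample somewhere and decide how to notify the user.

```python
import subprocess
from typing import Optional

# Illustrative excerpt of `smartctl -A` output; the numbers are invented.
SAMPLE_OUTPUT = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       812
193 Load_Cycle_Count        0x0032   085   085   000    Old_age   Always       -       46543
"""


def read_smart_attributes(device: str) -> str:
    """Return the text printed by `smartctl -A DEVICE` (needs smartmontools, usually root)."""
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True, check=False
    )
    return result.stdout


def parse_load_cycle_count(smartctl_output: str) -> Optional[int]:
    """Extract the raw Load_Cycle_Count value, or None if the attribute is not reported."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[1] == "Load_Cycle_Count":
            try:
                return int(fields[9])
            except ValueError:
                return None
    return None


def cycles_per_hour(prev_count: int, prev_time: float,
                    curr_count: int, curr_time: float) -> float:
    """Average load cycles per hour between two samples (times in seconds)."""
    elapsed_hours = (curr_time - prev_time) / 3600.0
    if elapsed_hours <= 0:
        raise ValueError("samples must be taken at increasing times")
    return (curr_count - prev_count) / elapsed_hours


def exceeds_threshold(rate_per_hour: float, limit_per_hour: float) -> bool:
    """True when the observed rate is above the limit the user chose."""
    return rate_per_hour > limit_per_hour


if __name__ == "__main__":
    count = parse_load_cycle_count(SAMPLE_OUTPUT)
    print("Load_Cycle_Count:", count)
```

Run hourly from cron, this would cover the "last hour" case; daily and monthly rates fall out of keeping older samples around.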
On Thu, Oct 29, 2009 at 8:34 PM, Michael Towers <larch42@googlemail.com> wrote:
Given the difficulty of finding the optimal solution to this problem, I think I agree with the earlier suggestion to just monitor the situation and report to the user if there is a problem - and provide a useful account of how to handle it. I imagine it would not be too difficult to write cron scripts to monitor the count. Perhaps one could measure load cycles over the last hour, the last day and the last month, for example. There could be some sort of notification if some threshold or other was overstepped. At this point it is of course less straightforward - what sort of notification should that be? A 'normal' desktop notification would perhaps work for most users?
Normally I wouldn't suggest something like this here for fear of getting my knuckles rapped - at present I really don't have time to do it myself - but maybe someone is just itching to get going on a little project like this.
That does not sound like a bad idea. And it could benefit all distros, not only Arch.
2009/10/29 Xavier <shiningxc@gmail.com>:
did you look at the config file ? http://git.kernel.org/?p=linux/kernel/git/tj/storage-fixup.git;a=blob_plain;...
It contains information about known bad disk and the command to execute for each disk. The script just parses the config file, looks if your disk matches one from the config file, and executes the command from the config.
I did, yes. It seems to do either -B 254 or 255, which disables the feature completely. Some drives disable at 254, others at 255. That seems to be the only difference.
What Michael Towers is suggesting is exactly what should be done IMHO. You can monitor through smartctl or even use smartd, accumulate data, and adjust the value passed to hdparm based on the rate at which the Load_Cycle_Count value increases over time. But, again, this leaves us with a few values we have to define as "ok".
1. How many cycles per unit of time is good? Drives are made for a certain number of cycles (600,000 or 300,000, I believe); divide that over a few years (say 5) to find a value to use as a benchmark? Should we make a distinction between mobile and stationary systems?
2. We should check if the system is on battery power; that usually means it's mobile and moving (if it's on a desk, it would be on AC). If it is on battery power, we should take into account that more cycles reduce power use and reduce the chance of damage due to shocks. Or should we ignore that and stay with the value determined at 1?
As you can see, there are a few choices that really should be made by the owner of the system.
To be honest, though, something that checks and updates to maintain a normal load_cycle average and offers the option to disable it completely would be better than the current state of storage-fixup. I can't help but feel this isn't very KISS though.
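To put numbers on point 1, here is a small Python sketch of the benchmark arithmetic. The rated figures (600,000 or 300,000 cycles) and the 5-year target are the ones guessed at in this message, not datasheet values, and the function names are made up for illustration. With those assumptions, 600,000 cycles spread over 5 years of continuous operation works out to roughly 13.7 load cycles per hour, and 300,000 to about 6.8.

```python
HOURS_PER_YEAR = 365 * 24  # assumes the drive is powered on around the clock


def cycle_budget_per_hour(rated_cycles: int, target_years: float) -> float:
    """Load cycles per hour that would use up the rated count in exactly target_years."""
    if target_years <= 0:
        raise ValueError("target_years must be positive")
    return rated_cycles / (target_years * HOURS_PER_YEAR)


def years_until_rated_limit(rated_cycles: int, current_count: int,
                            observed_per_hour: float) -> float:
    """Years of continuous use left before the rated count is reached at the observed rate."""
    if observed_per_hour <= 0:
        return float("inf")
    remaining = max(rated_cycles - current_count, 0)
    return remaining / (observed_per_hour * HOURS_PER_YEAR)


if __name__ == "__main__":
    for rated in (600_000, 300_000):
        print(rated, "cycles over 5 years:",
              round(cycle_budget_per_hour(rated, 5), 1), "per hour")
```

A laptop that is only powered on a few hours a day has a correspondingly larger hourly budget, so the always-on assumption is the conservative case.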
On Thu, Oct 29, 2009 at 15:08, Michael Towers <larch42@googlemail.com> wrote:
When I got a new laptop I investigated this problem a little and found that with hdparm -B 254/255 the temperature went up quite significantly. This may be a freak and I would love to know whether there really is something behind it, but when I used -B 200 the temperature increase was clearly smaller, but the load cycle count did not increase!!! Is this actually at all possible? Does the -B option do something other than only affecting head loading? Does anybody know?
In looking through the storage-fixup package data I see that always -B 254 or -B 255 is set, so obviously there is no sign of other, possibly more optimal values there.
Unfortunately there are not many ways to turn off head parking while not turning off power management. For WD drives there is a special DOS binary for that: wdidle3.exe which is provided by the support, but not allowed to be redistributed. You can easily find it on the internet though. Also I have seen some people doing tricks with sdparm, but I have not seen a reliable (non-specific to a particular HDD model) solution yet. -- Roman Kyrylych (Роман Кирилич)
Hey guys, new to the list.
Concerning this load_cycle_count issue, we should recall that applying hdparm -B 255 (254) /dev/sdx has more consequences that the user should, at least, be made aware of. As I'm sure you are all aware, completely disabling the feature will cause increased heat production and power consumption, but most importantly it increases the chance that the drive is damaged if the mobile device is moved during operation!
Perhaps this sort of action should be left to the user, and an applet or daemon should be written that monitors the load cycle count through S.M.A.R.T. and informs the user, in a more graphical or direct way, if it is increasing at an alarming rate. The user him/herself can then decide what to do. Is storage-fixup's -d an option for this?
Yes, it's a serious issue and yes, the users should be aware. But should the system itself decide to take this action, or should we simply inform and let the user decide? I lean towards the latter myself.
2009/10/19 Damjan Georgievski <gdamjan@gmail.com>
Does anyone else have an opinion about how to handle this?
I'd like to affirm the opinions of Roman and Xavier and take some action on this.
Anyone object to my putting storage-fixup in [extra] at least? If no objections by W 9/21, I plan to go ahead with that step. If it works out, we can talk about follow-up steps like:
1) moving it to [core] 2) integrating it into default rc.d scripts
It's a pretty serious issue for laptop users with affected drives. And the drives are pretty popular ones, methinks.
Also note that, when raising awareness about this issue, the fixup usually needs to be run on resume from disk (and I think resume from RAM) too, not only on boot.
-- damjan
On Thu, Oct 22, 2009 at 11:39, Stefan Erik Wilkens <stefanwilkens@gmail.com> wrote:
yes, it's a serious issue and yes the users should be aware. But should the system itself decide to take this action or should we simply inform and let the user decide. I lean towards the latter myself.
Absolutely inform the user. Don't do anything automatically for this. Also, please bottom post for these lists.
participants (10)
- Aaron Griffin
- Alexander Lam
- Daenyth Blank
- Damjan Georgievski
- Eric Bélanger
- Michael Towers
- Ng Oon-Ee
- Roman Kyrylych
- Stefan Erik Wilkens
- Xavier