[arch-general] linux 3.1-4 - two i686 lockups after ~ 5 hours of operations. two x86_64 seem OK

Leonid Isaev lisaev at umail.iu.edu
Thu Nov 10 15:47:39 EST 2011


On (11/10/11 14:19), David C. Rankin wrote:
-~> On 11/10/2011 01:28 PM, C Anthony Risinger wrote:
-~> >On Thu, Nov 10, 2011 at 1:16 PM, David C. Rankin
-~> ><drankinatty at suddenlinkmail.com>  wrote:
-~> >>
-~> >>  Richard, David - check your hardware clock "# hwclock -r" and compare that
-~> >>to the time returned by "# date". If they are hours apart, then make sure
-~> >>your sysclock is correct and set the hardware clock to your sysclock with "#
-~> >>hwclock -w". Worth checking regardless.  I know this used to be done on boot
-~> >>or shutdown and I don't know why it isn't anymore. I'll do some more
-~> >>digging.
-~> >
-~> >your machine reboots because of a drifting clock?  i don't understand.
-~> >
-~> >aren't you running ntpd (not openntpd)?<---- *HINT* *HINT*, if not ;-)
-~> >
-~> 
-~> Yes, I'm running ntpd and yes, I'm saying that my box reboots due
-~> to clock drift. Check out this bizarre log entry. Yes, this is the
-~> actual order of the log:
-~> 
-~> Nov 10 05:12:41 providence kernel: [    1.649918] rtc_cmos 00:05: setting system clock to 2011-11-10 11:12:27 UTC (1320923547)
-~> 
-~> <snip>
-~> Nov 10 05:12:55 providence ntpd[829]: ntpd 4.2.6p4@1.2324-o Sun Nov  6 05:50:06 UTC 2011 (1)
-~> Nov 10 05:12:56 providence ntpd[864]: proto: precision = 0.832 usec
-~> Nov 10 05:12:56 providence kernel: [   30.360065] NET: Registered protocol family 10
-~> Nov 10 05:12:56 providence ntpd[864]: ntp_io: estimated max descriptors: 1024, initial socket boundary: 16
-~> Nov 10 05:12:56 providence ntpd[864]: Listen and drop on 0 v4wildcard 0.0.0.0 UDP 123
-~> Nov 10 05:12:56 providence ntpd[864]: Listen and drop on 1 v6wildcard :: UDP 123
-~> Nov 10 05:12:56 providence ntpd[864]: Listen normally on 2 lo 127.0.0.1 UDP 123
-~> Nov 10 05:12:56 providence ntpd[864]: Listen normally on 3 eth0 192.168.7.124 UDP 123
-~> Nov 10 05:12:56 providence ntpd[864]: Listen normally on 4 lo ::1 UDP 123
-~> Nov 10 05:12:56 providence ntpd[864]: peers refreshed
-~> Nov 10 05:12:56 providence ntpd[864]: Listening on routing socket on fd #21 for interface updates
-~> Nov 10 05:12:57 providence apcupsd[867]: apcupsd 3.14.10 (13 September 2011) unknown startup succeeded
-~> Nov 10 05:12:57 providence apcupsd[867]: NIS server startup succeeded
-~> Nov 10 05:12:58 providence ntpd[864]: Listen normally on 5 eth0 fe80::211:43ff:fe22:5008 UDP 123
-~> Nov 10 05:12:58 providence ntpd[864]: peers refreshed
-~> Nov 10 05:12:58 providence ntpd[864]: new interface(s) found: waking up resolver
-~> 
-~> <snip>
-~> Nov 10 05:14:02 providence dbus[717]: [system] Successfully activated service 'org.freedesktop.PolicyKit1'
-~> Nov 10 05:14:02 providence dbus[717]: [system] Successfully activated service 'org.freedesktop.ConsoleKit'
-~> Nov  9 15:29:01 providence crond[859]: time disparity of -827 minutes detected
-~> Nov  9 15:32:24 providence crond[19989]: mailing cron output for user root job sys-daily
-~> 
-~> Huh?? The system jumped backwards? Whatever is causing this to
-~> occur is causing the spontaneous reboot. Taking a Linux system
-~> forward in time is OK, but taking it backwards in time really
-~> causes things to go haywire. The hwclock doesn't seem to
-~> drift that much, so I don't know what the issue is. I set the
-~> hardware clock about 3 hours ago and there is no drift:
-~> 
-~> [14:16 providence:/home/david/tmp] # hwclock -r; date
-~> Thu 10 Nov 2011 02:17:44 PM CST  -0.125494 seconds
-~> Thu Nov 10 14:17:44 CST 2011
-~> 
-~> Something is up though, but I can't explain it.
-~> 
-~> -- 
-~> David C. Rankin, J.D.,P.E.
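
The backward jump in the quoted log (Nov 10 05:14:02 followed by
Nov  9 15:29:01) is the sort of thing a short script can look for in a
longer log. Below is a minimal sketch in Python, not a polished tool. It
assumes classic syslog timestamps without a year, a log that stays within
one calendar year, and an English (C) locale for month names. The function
name find_backward_jumps and its year argument are made up for this
illustration.

```python
from datetime import datetime


def find_backward_jumps(lines, year=2011):
    """Return (previous, current) timestamp pairs where a log line
    is stamped earlier than the timestamped line before it."""
    jumps = []
    prev = None
    for line in lines:
        # Classic syslog prefix is 15 characters: "Nov 10 05:14:02"
        # or "Nov  9 15:29:01". The year is an assumption supplied
        # by the caller, because syslog does not record it.
        try:
            stamp = datetime.strptime(f"{year} {line[:15]}",
                                      "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # wrapped continuation line or other non-syslog text
        if prev is not None and stamp < prev:
            jumps.append((prev, stamp))
        prev = stamp
    return jumps
```

Feeding it the lines of /var/log/messages (or wherever syslog writes on
the box in question) gives a list of the places where the clock went
backwards, which can then be matched against the reboot times.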

OK. Off the top of my head, I would suggest:
1. Try a different clocksource via the clocksource= boot parameter (see
kernel-parameters.txt).
2. Add "-ddd" to the NTPD_ARGS variable in /etc/conf.d/ntpd.conf to get debug
output from ntpd.
3. Check http://twiki.ntp.org/bin/view/Support/KnownHardwareIssues (you might
need to disable ntpd).
4. Try community/chrony.
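
For item 1, the kernel exposes the active clocksource and the available
alternatives through sysfs, so it is easy to see which values are
candidates before rebooting with a different one. A minimal sketch in
Python, assuming the usual sysfs location under
/sys/devices/system/clocksource/clocksource0. The helper name
read_clocksources is hypothetical, and it returns (None, []) when the
files are not present.

```python
from pathlib import Path

# Usual sysfs location of the clocksource controls (assumed to exist
# on the machine being checked).
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(base=CLOCKSOURCE_DIR):
    """Return (current, available) clocksources, or (None, []) if the
    sysfs files cannot be read."""
    try:
        current = (base / "current_clocksource").read_text().strip()
        available = (base / "available_clocksource").read_text().split()
    except OSError:
        return None, []
    return current, available
```

Calling read_clocksources() and printing the result shows the current
choice next to the alternatives, typically names such as tsc, hpet or
acpi_pm depending on the hardware.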

-- 
Leonid Isaev
GnuPG key ID: 164B5A6D
Key fingerprint: C0DF 20D0 C075 C3F1 E1BE  775A A7AE F6CB 164B 5A6D