[arch-general] Hardlock after postfix/smtp entry in log - leaves 4 lost inodes each time - ideas?

Mon Mar 18 09:16:51 EDT 2013

Guys,

  I have a server that hardlocks every week or two. The log entries always
look the same: a postfix/smtp transaction is in progress when the lock
occurs. After the lockup, the next boot drops to maintenance mode, and there
are always 4 inodes that are part of an orphaned linked list. fsck fixes them,
and then the machine reboots normally. The log entries just prior to the lockup
look like this:

Mar 17 16:07:16 phoenix postfix/anvil[26843]: statistics: max connection rate
1/60s for (smtp:213.199.243.30) at Mar 17 16:01:52
Mar 17 16:07:16 phoenix postfix/anvil[26843]: statistics: max connection count 1
for (smtp:213.199.243.30) at Mar 17 16:01:52
Mar 17 16:07:16 phoenix postfix/anvil[26843]: statistics: max cache size 1 at
Mar 17 16:01:52
Mar 17 16:14:52 phoenix postfix/qmgr[1019]: 81963E9720:
from=<inconsiderableka04 at gil.com.au>, size=7485, nrcpt=1 (queue active)
Mar 17 16:14:52 phoenix postfix/smtp[26899]: 81963E9720:
to=<**snipped**@3111skyline.com>, relay=3111skyline.com[66.76.63.120]:25,
delay=1118, delays=1118/0.02/0.16/0.17, dsn=4.7.1, status=deferred (host
3111skyline.com[66.76.63.120] said: 450 4.7.1 Client host rejected: cannot find
your hostname, [66.76.63.60] (in reply to RCPT TO command))
Mar 18 07:34:19 phoenix kernel: [    0.000000] Initializing cgroup subsys cpuset
Mar 18 07:34:19 phoenix kernel: [    0.000000] Initializing cgroup subsys cpu
Mar 18 07:34:19 phoenix kernel: [    0.000000] Linux version 3.4.7-1-ARCH
(tobias at T-POWA-LX) (gcc version 4.7.1 20120721 (prerelease) (GCC) ) #1
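
  One way to confirm that the last entry before every lockup really is
postfix/smtp is to pull the lines preceding each boot banner out of the log.
Below is a minimal Python sketch of that idea, not a tested tool. It assumes
that a kernel line stamped [    0.000000] marks a fresh boot, as in the excerpt
above. The function name lines_before_boots and the log path passed on the
command line are hypothetical. It cannot tell a lockup from a clean reboot, so
shutdown messages in the output mean that boot can be ignored.

```python
import re
import sys
from collections import deque

# Assumption: a kernel line with a [    0.000000] timestamp marks a fresh boot.
BOOT_RE = re.compile(r"kernel: \[\s*0\.000000\]")


def lines_before_boots(lines, context=3):
    """Return, for each boot found, the `context` log lines just before it."""
    recent = deque(maxlen=context)   # rolling window of the latest non-boot lines
    in_boot_banner = False           # True while inside a run of 0.000000 lines
    results = []
    for line in lines:
        line = line.rstrip("\n")
        if BOOT_RE.search(line):
            if not in_boot_banner:
                # First line of a new boot: record what preceded it.
                results.append(list(recent))
                recent.clear()
                in_boot_banner = True
        else:
            in_boot_banner = False
            recent.append(line)
    return results


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python before_boot.py /path/to/syslog-file
    with open(sys.argv[1], errors="replace") as log:
        for number, tail in enumerate(lines_before_boots(log), start=1):
            print(f"--- last lines before boot {number} ---")
            for entry in tail:
                print(entry)
```

  If the lines before each boot come from different daemons, the postfix/smtp
entry is more likely a coincidence of postfix being the most frequent logger
than a cause.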

  Searching the web, I cannot find any connection between postfix/smtp and the
lockup. So I'm asking here: has anyone else seen a lockup where the last log
entry is a postfix/smtp entry, followed by an error about 4 orphaned inodes on
reboot?  This has occurred multiple times over the past year or so. memtest
completes without error, and the drives show no other errors or issues. Drive
temps are stable at:

/dev/sda: ST3250410AS: 35°C
/dev/sdb: ST3250410AS: 39°C

  Any feedback is welcome. Otherwise, it looks like this has to be hardware.

-- 
David C. Rankin, J.D.,P.E.