[arch-general] System terribly slow

riveravaldez riveravaldezmail at gmail.com
Fri Dec 17 03:33:57 UTC 2021


On Friday, December 3, 2021, Iyán Méndez Veiga via arch-general <
arch-general at lists.archlinux.org> wrote:
> Hi,
>
> On Saturday, 4 December 2021 01:07:18 CET riveravaldez via arch-general wrote:
>> Hi,
>>
>> I'm looking for some advice trying to pinpoint why or where my system has
>> become almost unusably slow. Searching the web I've found some hints
>> but nothing precise enough. Maybe it's a failing HDD, but I'm first posting
>> what I have so far in the hope someone can give some advice.
>
> Have you checked the SMART health status of your disks? Maybe also run some
> self-tests to rule out HDD issues. An HDD about to die can make the system
> incredibly slow.

Hi, Iyán, thanks a lot for the reply and sorry for the delay.
I've run some SMART tests (short and long); this is what I have so far:

$ sudo smartctl -i /dev/sda | grep SMART
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

$ sudo smartctl -c /dev/sda
=== START OF READ SMART DATA SECTION ===
General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 113) The previous self-test completed having
                                        the read element of the test failed.
Total time to complete Offline
data collection:                 (  645) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  83) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

$ sudo smartctl -H /dev/sda
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

$ sudo smartctl -l selftest /dev/sda
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       10%     52941         98659719
# 2  Short offline       Completed without error       00%     52932         -
# 3  Extended offline    Completed: read failure       10%     52887         78053410
# 4  Short offline       Completed without error       00%     52881         -
# 5  Extended offline    Completed: read failure       10%     52875         98659715
# 6  Short offline       Completed without error       00%     52868         -
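
A side note on the LBA_of_first_error values: assuming the 512-byte logical
sector size that smartctl reports for this disk, they can be converted to byte
offsets to see roughly where on the disk the reads fail. This is only a rough
sketch in Python; the function name lba_to_offset is made up for illustration.

```python
SECTOR_BYTES = 512  # logical sector size reported by `smartctl -i` for this disk


def lba_to_offset(lba, sector_bytes=SECTOR_BYTES):
    """Return (byte offset, offset in GiB) for a logical block address."""
    offset = lba * sector_bytes
    return offset, offset / 2**30


# LBA_of_first_error values taken from the self-test log
for lba in (98659719, 78053410, 98659715):
    offset, gib = lba_to_offset(lba)
    print(f"LBA {lba}: byte {offset} (~{gib:.2f} GiB into the disk)")
```

Two of the three failing LBAs (98659715 and 98659719) are only 4 sectors
apart, so at least those two tests stopped in nearly the same spot.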

$ sudo smartctl -a /dev/sda
=== START OF INFORMATION SECTION ===
Model Family:     Hitachi Travelstar 5K500.B
Device Model:     Hitachi HTS545025B9A300
Serial Number:    091108PB42061SCP1DUL
LU WWN Device Id: 5 000cca 5e8c99119
Firmware Version: PB2OC60N
User Capacity:    250.059.350.016 bytes [250 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    5400 rpm
Form Factor:      2.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 6
SATA Version is:  SATA 2.6, 3.0 Gb/s
Local Time is:    Thu Dec 16 21:59:01 2021 -03
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 113) The previous self-test completed having
                                        the read element of the test failed.
Total time to complete Offline
data collection:                 (  645) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  83) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   098   098   062    Pre-fail  Always       -       196609
  2 Throughput_Performance  0x0005   100   100   040    Pre-fail  Offline      -       0
  3 Spin_Up_Time            0x0007   206   206   033    Pre-fail  Always       -       1
  4 Start_Stop_Count        0x0012   098   098   000    Old_age   Always       -       3347
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   100   100   040    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0012   001   001   000    Old_age   Always       -       53005
 10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   098   098   000    Old_age   Always       -       3213
191 G-Sense_Error_Rate      0x000a   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       1114237
193 Load_Cycle_Count        0x0012   001   001   000    Old_age   Always       -       4508441
194 Temperature_Celsius     0x0002   144   144   000    Old_age   Always       -       38 (Min/Max 9/50)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       7
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       3
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       2
223 Load_Retry_Count        0x000a   100   100   000    Old_age   Always       -       0

SMART Error Log Version: 1
ATA Error Count: 2
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 2 occurred at disk power-on lifetime: 3303 hours (137 days + 15 hours)
 When the command that caused the error occurred, the device was active or
idle.

 After command completion occurred, registers were:
 ER ST SC SN CL CH DH
 -- -- -- -- -- -- --
 40 51 4b dd df 1d e1  Error: UNC 75 sectors at LBA = 0x011ddfdd = 18735069

 Commands leading to the command that caused the error were:
 CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
 -- -- -- -- -- -- -- --  ----------------  --------------------
 25 00 80 a8 df 1d e0 00      00:34:49.600  READ DMA EXT
 ea 00 00 00 00 00 a0 00      00:34:49.600  FLUSH CACHE EXT
 25 00 08 80 cc 1d e0 00      00:34:49.600  READ DMA EXT
 35 00 08 c8 cd 5b e0 00      00:34:49.600  WRITE DMA EXT
 25 00 08 f8 41 28 e0 00      00:34:49.600  READ DMA EXT

Error 1 occurred at disk power-on lifetime: 3303 hours (137 days + 15 hours)
 When the command that caused the error occurred, the device was active or
idle.

 After command completion occurred, registers were:
 ER ST SC SN CL CH DH
 -- -- -- -- -- -- --
 40 51 4b dd df 1d e1  Error: UNC 75 sectors at LBA = 0x011ddfdd = 18735069

 Commands leading to the command that caused the error were:
 CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
 -- -- -- -- -- -- -- --  ----------------  --------------------
 25 00 80 a8 df 1d e0 00      00:34:45.600  READ DMA EXT
 25 00 08 50 a0 20 e0 00      00:34:44.700  READ DMA EXT
 25 00 68 00 df 1d e0 00      00:34:44.700  READ DMA EXT
 25 00 30 80 f9 20 e0 00      00:34:44.700  READ DMA EXT
 25 00 08 20 b6 21 e0 00      00:34:44.700  READ DMA EXT

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       10%     52941         98659719
# 2  Short offline       Completed without error       00%     52932         -
# 3  Extended offline    Completed: read failure       10%     52887         78053410
# 4  Short offline       Completed without error       00%     52881         -
# 5  Extended offline    Completed: read failure       10%     52875         98659715
# 6  Short offline       Completed without error       00%     52868         -

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
   1        0        0  Not_testing
   2        0        0  Not_testing
   3        0        0  Not_testing
   4        0        0  Not_testing
   5        0        0  Not_testing
Selective self-test flags (0x0):
 After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

That's it.
I'm not sure if that's more or less normal or a dying disk...
Any comment? Something informative there?
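
To make the attribute table easier to scan, here is a small Python sketch that
picks out the sector-related counters with a non-zero raw value. Which
attributes are worth watching is my assumption from what I've read, not
something taken from a vendor specification, and the names WATCHED and
nonzero_watched are only illustrative.

```python
import re

# Assumption: these counters are the ones usually watched for media trouble.
WATCHED = (
    "Reallocated_Sector_Ct",
    "Reallocated_Event_Count",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
)

# Columns: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE.
# \s+ also matches newlines, so rows wrapped by a mail client still parse.
ROW = re.compile(
    r"^\s*\d+\s+(\S+)\s+0x[0-9a-fA-F]+\s+\d+\s+\d+\s+\d+"
    r"\s+\S+\s+\S+\s+\S+\s+(\d+)",
    re.M,
)


def nonzero_watched(smart_text):
    """Return {attribute_name: raw_value} for watched attributes with raw > 0."""
    found = {}
    for name, raw in ROW.findall(smart_text):
        if name in WATCHED and int(raw) > 0:
            found[name] = int(raw)
    return found


# A few rows copied from the `smartctl -a` output in this message
SAMPLE = """
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       7
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       3
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
"""

print(nonzero_watched(SAMPLE))
```

On the rows above this reports Reallocated_Event_Count (7) and
Current_Pending_Sector (3), which, together with the read failures in the
extended self-tests, are the entries I'm most unsure about.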

>> Superficial symptoms are a general slowness. I'm using just IceWM and
>> `startx` to initialize the GUI, and even logging into my account before
>> that takes almost a minute. Then any application I launch takes
>> minutes just to start (from IceWM to firefox, from qtox to pcmanfm or
>> geany, etc.). Even navigating the folder tree with pcmanfm takes 10 or more
>> seconds just to show any folder's contents...
>
> Do you have a single disk or more than one?

Single disk.

> Also, to rule out other hardware problems, can you boot Arch (or any other
> distro) from a USB drive and check if the system is more responsive?

I'll do that next and report.

> Another thing you can check is the CPU frequency. If the CPU gets too hot,
> modern CPUs will throttle a lot. Although if you have an HDD I don't think
> this is the case... In any case, it's a quick thing to check temperature
> and frequency.

I'm looking into that right now. Any hint or recommendation
on the best way to do it?
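
In the meantime, this is a minimal Python sketch of what I understand can be
read straight from sysfs. It assumes the usual thermal_zone*/temp files
(millidegrees Celsius) and cpufreq/scaling_cur_freq files (kHz) exist on this
machine, which depends on the loaded drivers; the function names are only
illustrative.

```python
from pathlib import Path


def read_int(path):
    """Return the integer stored in a sysfs file, or None if it is unreadable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def cpu_snapshot(sys_root="/sys"):
    """Collect thermal zone temperatures (deg C) and core frequencies (MHz)."""
    root = Path(sys_root)
    temps = {}
    for zone in sorted(root.glob("class/thermal/thermal_zone*")):
        milli_c = read_int(zone / "temp")
        if milli_c is not None:
            temps[zone.name] = milli_c / 1000  # millidegrees Celsius -> deg C
    freqs = {}
    for cpu in sorted(root.glob("devices/system/cpu/cpu[0-9]*")):
        khz = read_int(cpu / "cpufreq" / "scaling_cur_freq")
        if khz is not None:
            freqs[cpu.name] = khz / 1000  # kHz -> MHz
    return temps, freqs


temps, freqs = cpu_snapshot()
for name, deg_c in temps.items():
    print(f"{name}: {deg_c:.1f} C")
for name, mhz in freqs.items():
    print(f"{name}: {mhz:.0f} MHz")
```

Running it a few times while the system feels slow should show whether the
frequencies drop while the temperatures are high.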

> Also, did the slowness start after updating any BIOS/firmware?

Not to my knowledge.
Right now, on boot, this is all I have:

$ sudo dmesg
(...)
[  533.325003] nouveau 0000:01:00.0: fifo: INTR 00000001: 00000000
[  533.325022] nouveau 0000:01:00.0: fifo: SCHED_ERROR 00 []
[  533.325029] nouveau 0000:01:00.0: fifo: INTR 00010000: 00000000
[  533.325034] nouveau 0000:01:00.0: fifo: INTR 01000000: 00000000
[  533.325042] nouveau 0000:01:00.0: fifo: INTR 08800010
[  533.325090] nouveau 0000:01:00.0: bus: MMIO read of 00000000 FAULT at 002100 [ !ENGINE ]
(...)

$ sudo journalctl -b -exp3
-- Journal begins at Thu 2021-12-02 23:59:31 -03, ends at Mon 2021-12-13 21:32:34 -03. --
dic 13 20:58:00 arch libvirtd[608]: cannot open directory '/home/dell/Software/VMs/TrisquelMini8': No existe el fichero o el directorio
dic 13 20:58:00 arch libvirtd[608]: error interno: Falló al iniciar automáticamente el grupo de almacenamiento 'TrisquelMini8': cannot open directory '/home/dell/Softwar>
dic 13 20:58:02 arch libvirtd[608]: No se encontró 'dmidecode' en ruta: No existe el fichero o el directorio
dic 13 20:58:12 arch libvirtd[608]: No se encontró 'dmidecode' en ruta: No existe el fichero o el directorio
dic 13 21:05:29 arch kernel: nouveau 0000:01:00.0: fifo: SCHED_ERROR 00 []
dic 13 21:05:29 arch kernel: nouveau 0000:01:00.0: fifo: INTR 08800010
dic 13 21:05:29 arch kernel: nouveau 0000:01:00.0: bus: MMIO read of 00000000 FAULT at 002100 [ !ENGINE ]

> Maybe also run some memtest?

I'll try that also and report.

> Hope it helps.

Me too. Thanks a lot again!

