[arch-general] On-boot delay due to timer units
Hi,
Since anacron jobs were replaced with timers, I am seeing a noticeable delay before the agetty prompt appears on machines which were unused for some time (due to update/man-db timers starting up simultaneously).
TLDR: Anacron inserts a random delay between boot and running the jobs, so is it possible to simulate this behavior by including e.g. "OnBootSec=..." in the timers at the next update? Or is this option incompatible with OnCalendar?
Here is the (edited) "statistics" obtained by grepping /var/log/daemon.log. The disk is actually an Intel X-25 (SATA-2) SSD.
--- No timers are active (baseline) ---
Apr 6: 5.983s (kernel) + 1.947s (userspace) = 7.930s.
Apr 6: 5.815s (kernel) + 2.494s (userspace) = 8.310s.
Apr 6: 5.692s (kernel) + 1.612s (userspace) = 7.304s.
Apr 7: 5.874s (kernel) + 2.561s (userspace) = 8.436s.
Apr 9: 5.704s (kernel) + 3.001s (userspace) = 8.706s.
Apr 10: 5.612s (kernel) + 2.494s (userspace) = 8.106s.
Apr 11: 5.618s (kernel) + 2.908s (userspace) = 8.526s.
Apr 12: 5.671s (kernel) + 3.345s (userspace) = 9.016s.
--- Timers first run ---
Apr 14: 5.464s (kernel) + 46.883s (userspace) = 52.348s.
--- Startup with timers ---
Apr 15: 5.715s (kernel) + 2.878s (userspace) = 8.593s.
Apr 16: Not powered on
Apr 17: 6.414s (kernel) + 7.785s (userspace) = 14.200s.
$ systemd-analyze blame | head
    6.724s man-db.service
    1.935s updatedb.service
    926ms root-ssh-key-init@0x14d33aba.service
    507ms lxc@appserver\x2dx86_64.service
    427ms rfkill-unblock@wlan.service
    381ms systemd-networkd.service
    340ms wlan-powersave@wls1.service
    289ms syslog-ng.service
    235ms volatile-mail.service
    225ms iptables.service
Thanks,
L.
--
Leonid Isaev
GPG fingerprints: DA92 034D B4A8 EC51 7EA6 20DF 9291 EE8A 043C B8C4
                  C0DF 20D0 C075 C3F1 E1BE 775A A7AE F6CB 164B 5A6D
On 17.04.2014 at 20:56, Leonid Isaev wrote:
Hi,
Since anacron jobs were replaced with timers, I am seeing a noticeable delay before agetty prompt appears on machines which were unused for some time (due to update/man-db timers starting up simultaneously).
TLDR: Anacron inserts a random delay between boot and running the jobs, so is it possible to simulate this behavior by including e.g. "OnBootSec=..." in the timers at next update? Or is this option incompatible with OnCalendar?
OnBootSec would cause the timers to always run on boot, no matter how much time has passed, which is not what we want.
I don't think it is a problem that the timers run on boot, but rather that they delay Type=idle units, like agetty. From what the documentation says, there should not be any delay:
"Behavior of idle is very similar to simple; however, actual execution of the service binary is delayed until all jobs are dispatched."
I am confused why we get a delay here.
I think another solution in systemd would be introducing a holdoff time: instead of running immediately on boot, the timer should be scheduled for boot+5min.
This requires some investigation - sorry, I don't have a quick solution right now.
On Thu, 17 Apr 2014 21:31:07 +0200 Thomas Bächler <thomas@archlinux.org> wrote:
On 17.04.2014 at 20:56, Leonid Isaev wrote:
Hi,
Since anacron jobs were replaced with timers, I am seeing a noticeable delay before agetty prompt appears on machines which were unused for some time (due to update/man-db timers starting up simultaneously).
TLDR: Anacron inserts a random delay between boot and running the jobs, so is it possible to simulate this behavior by including e.g. "OnBootSec=..." in the timers at next update? Or is this option incompatible with OnCalendar?
OnBootSec would cause the timers to always run on boot, no matter how much time has passed, which is not what we want.
OK.
I don't think it is a problem that the timers run on boot, but rather that they delay Type=idle units, like agetty. From what the documentation says, there should not be any delay:
"Behavior of idle is very similar to simple; however, actual execution of the service binary is delayed until all jobs are dispatched."
I am confused why we get a delay here.
I think the problem is the disk I/O generated by e.g. man-db indexing, because I see the HDD light stay solidly on. So my guess is that one of two things can happen: either the login prompt is delayed, or the prompt is shown but the actual login stalls.
I think another solution in systemd would be introducing a holdoff time: Instead of running immediately on boot, the timer should be scheduled for boot+5min.
You are right -- that's the best way to put it. Except, I'd generate random timeouts (distributed in some interval) for the corresponding services...
Thanks,
L.
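A minimal sketch of the "random timeout distributed in some interval" idea, in plain POSIX shell. The helper name `random_delay` and the 60..300 second interval in the usage note are assumptions for illustration; nothing in the thread fixes either, and this is not how anacron or systemd picks its delay.

```shell
# Hypothetical helper: print a pseudo-random delay in [min, max] seconds.
# $1: lower bound in seconds, $2: upper bound in seconds.
random_delay() {
    min=$1
    max=$2
    span=$((max - min + 1))
    # Read two random bytes as one unsigned integer (0..65535) and
    # keep only the digits of od's output.
    r=$(od -An -N2 -tu2 /dev/urandom | tr -dc '0-9')
    # Fold the value into the requested interval.
    echo $((min + r % span))
}
```

A job wrapper could then call `sleep "$(random_delay 60 300)"` before starting the real work, so that several jobs triggered at boot do not all start in the same second.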
On Thu, Apr 17, 2014 at 9:31 PM, Thomas Bächler <thomas@archlinux.org> wrote:
I don't think it is a problem that the timers run on boot, but rather that they delay Type=idle units, like agetty. From what the documentation says, there should not be any delay:
"Behavior of idle is very similar to simple; however, actual execution of the service binary is delayed until all jobs are dispatched."
I am confused why we get a delay here.
When the timer fires it adds a start job to the manager. Type=idle services wait for the manager job list (not the transaction job list) to empty. Maybe Type=idle should be changed to trigger when its transaction completes.
On Thu, 17 Apr 2014 21:31:07 +0200 Thomas Bächler <thomas@archlinux.org> wrote:
[...]
I think another solution in systemd would be introducing a holdoff time: Instead of running immediately on boot, the timer should be scheduled for boot+5min.
This requires some investigation - sorry, I don't have a quick solution right now.
AFAIU, there are 2 real issues here:
1. We hook into the boot process a bunch of disk-intensive operations which did not belong there in the first place.
2. Even if a boot delay for timers is implemented or the behavior of Type=idle units is "fixed" somehow in systemd, all "cron" timers will still be started in parallel, which may result in a slow/unresponsive system. Note that under anacron they were serialized by run-parts.
BTW, thanks for bringing this up on the systemd ML.
L.
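To illustrate the serialization point, here is a rough shell sketch of running jobs strictly one after another, which is the property run-parts provided under anacron. The function name `run_serially` is hypothetical, and continuing after a failed job is a choice made for this sketch rather than a statement about run-parts itself.

```shell
# Hypothetical helper: run each argument as a command line, one at a
# time, waiting for each to finish before starting the next.
run_serially() {
    for job in "$@"; do
        # A failing job is reported on stderr; the remaining jobs
        # still run.
        sh -c "$job" || echo "job failed: $job" >&2
    done
}
```

With placeholder command lines, `run_serially 'updatedb' 'mandb --quiet'` would finish the first job before the second one starts, so the two never compete for the disk at the same time.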
On 21.04.2014 at 18:56, Leonid Isaev wrote:
AFAIU, there are 2 real issues here:
1. We hook into the boot process a bunch of disk-intensive operations which did not belong there in the first place.
2. Even if a boot delay for timers is implemented or the behavior of Type=idle units is "fixed" somehow in systemd, all "cron" timers will still be started in parallel, which may result in a slow/unresponsive system. Note that under anacron they were serialized by run-parts.
The Linux scheduler and I/O scheduler are supposed to handle such workloads gracefully. The units are configured to run in the proper I/O scheduling class and with proper nice values.
* Thomas Bächler <thomas@archlinux.org> [2014-04-17 21:31] :
I think another solution in systemd would be introducing a holdoff time: Instead of running immediately on boot, the timer should be scheduled for boot+5min.
This requires some investigation - sorry, I don't have a quick solution right now.
Hi,
I'm experiencing the same problem caused by updatedb launching on boot. This fixes it until the desired feature is added to systemd.timer:
## /etc/systemd/system/updatedb.service.d/delay.conf
[Service]
# Trick to avoid launching updatedb when the system is booted.
ExecStartPre=-/usr/bin/bash -c '[ $(cut -d. -f1 /proc/uptime) -lt 120 ] && sleep 120'
## EOF
If the system has been up for less than 120 seconds, this waits 120 seconds. Also, a failure does not prevent ExecStart= from running, given the "-" before the command line.
Best regards,
--
Alexandre de Verteuil <claudelepoisson@gmail.com>
public key ID : 0xDD237C00
http://alexandre.deverteuil.net/
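The uptime test in the drop-in above can be written out as a small shell function, which makes the logic easier to read and to try without rebooting. This is only a sketch: the name `booted_recently` and the optional second argument (a file to read instead of /proc/uptime) are assumptions added for illustration.

```shell
# Hypothetical helper: succeed (exit 0) when the system uptime is
# below the given threshold.
# $1: threshold in seconds
# $2: file to read the uptime from (defaults to /proc/uptime)
booted_recently() {
    threshold=$1
    uptime_file=${2:-/proc/uptime}
    # The first field of /proc/uptime is the time since boot in
    # seconds, e.g. "87.53"; keep only the integer part.
    up=$(cut -d. -f1 "$uptime_file")
    [ "$up" -lt "$threshold" ]
}

# Usage: hold back for 120 s only when called shortly after boot.
# if booted_recently 120; then sleep 120; fi
```

Written with `if`, the command exits with status 0 whether or not it sleeps. The one-liner in the drop-in instead returns a non-zero status whenever the uptime is already 120 seconds or more, which is the case the "-" prefix takes care of.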
participants (4)
- Alexandre de Verteuil
- Jan Alexander Steffens
- Leonid Isaev
- Thomas Bächler