On Sun, Sep 09, 2018 at 08:53:07PM -0400, Andrew Crerar wrote:
Hi all,
On 2018-07-28 there were some discussions in #archlinux-devops around setting up some sort of centralized logging/monitoring/alerting solution for the various services on Apollo (and maybe other?) server(s). I had mentioned possibly using the ELK[1] stack for this task. There was some back and forth about it potentially being a bit heavy-handed for what was needed and how we would most likely need to repurpose/dedicate something like nymeria to handle the stack. There was also the suggestion of possibly using something like tenshi[2] if we're aiming for a low-overhead solution; however, that would mean writing a lot of regexes.
With that being said, the purpose of this email is to have a more formal discussion around what we're trying to capture from the logs, the actions we want to have taken with what ends up being captured, and possibly come to a consensus on what tool(s) we could leverage.
Thoughts?
Regards,
Andrew
[1] https://www.elastic.co/de/elk-stack
[2] https://github.com/inversepath/tenshi
Hi Andrew,

A whole ELK stack is pretty big. How many servers do we have? An ELK stack is not only big, it is also a pain to keep up to date, because you have to update and manage every component of it. I have seen many companies switch to Graylog instead, but I think even Graylog is overkill here.

If you just want 'log gathering', there are smaller solutions for this, for example `systemd-journal-gatewayd`. Has anybody tried this before?

Besides all of this: it is a bit overkill and it increases the attack surface, but if you want to take care of it and take full responsibility for it, why not?

I am not part of the DevOps team, so don't take my opinion as something official. I am only replying because I was asked for an opinion.

chris / shibumi
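
To make the 'log gathering' idea a bit more concrete, here is a minimal sketch of the consuming side, assuming the journal is exported as one JSON object per line (which is what `journalctl -o json` produces). The sample entries, the unit names in them, and the `important_entries` helper are made up for illustration; nothing here reflects how the Arch servers are actually set up.

```python
import json

# Hypothetical sample of a JSON journal export: one object per line.
SAMPLE = """\
{"PRIORITY": "6", "_SYSTEMD_UNIT": "nginx.service", "MESSAGE": "started worker"}
{"PRIORITY": "3", "_SYSTEMD_UNIT": "postfix.service", "MESSAGE": "queue file write error"}
{"PRIORITY": "2", "_SYSTEMD_UNIT": "sshd.service", "MESSAGE": "fatal: bad config"}
"""


def important_entries(text, max_priority=3):
    """Return (unit, message) pairs whose syslog priority is <= max_priority.

    Lower numbers are more severe (0 = emerg, 3 = err, 6 = info).
    """
    hits = []
    for line in text.splitlines():
        if not line.strip():
            continue
        entry = json.loads(line)
        # The JSON export carries PRIORITY as a string; entries without it
        # are treated as "info" (6) here, which is an assumption of this sketch.
        priority = int(entry.get("PRIORITY", "6"))
        if priority <= max_priority:
            hits.append(
                (entry.get("_SYSTEMD_UNIT", "unknown"), entry.get("MESSAGE", ""))
            )
    return hits


if __name__ == "__main__":
    for unit, message in important_entries(SAMPLE):
        print(f"{unit}: {message}")
```

Filtering on the structured `PRIORITY` field like this avoids most of the regex writing that a pattern-based tool would need, at the cost of only seeing what goes through the journal.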