[arch-dev-public] Changing raid/raid-partitions initcpio hooks
Hi guys,
while experimenting with different raid setups in kvm, I found out that our raid support is not really ideal.
The status now:
- The raid hook (included in the mkinitcpio package) does assemble normal raid, but if a drive fails, the system will fail to boot afterwards. I'm not 100% sure whether it can handle all raid levels; partitionable mdp raid definitely doesn't work with this hook.
- The raid-partitions hook (included in the mdadm package) can only assemble one raid partition array, which is bad imho: you should be able to assemble more than one raid partition device from the command line if you wish to do so.
- UUID is not supported by either of them.
Suggestion: Shouldn't we replace all this with one hook that can handle all these cases?
Now comes the catch: to achieve this, mdadm will be needed in the initramfs, which means having a 900k static binary in the early boot sequence. (The raid-partitions hook already uses this binary.)
The trick I thought of would be to dynamically generate an mdadm.conf file from the boot command line, keeping the old assembling syntax and adding a new syntax for UUID support.
Sample code is uploaded here: http://www.archlinux.org/~tpowa/mdadm.hook (not yet tested to boot a system, it's sample code!)
Problems: Should this replace an existing hook or be a brand new hook? What should happen to the other hooks, so that we don't break a user's setup? Doing it with a NEWS item and an installation message should be fine imho.
What do you guys think of this? Thanks for your input.
greetings tpowa
-- Tobias Powalowski Archlinux Developer & Package Maintainer (tpowa) http://www.archlinux.org tpowa@archlinux.org
On Wed, Feb 18, 2009 at 3:40 PM, Tobias Powalowski <t.powa@gmx.de> wrote:
> Hi guys, while experimenting with different raid setups in kvm, I found out that our raid support is not really ideal.
> The status now:
> - The raid hook (included in the mkinitcpio package) does assemble normal raid, but if a drive fails, the system will fail to boot afterwards. I'm not 100% sure whether it can handle all raid levels; partitionable mdp raid definitely doesn't work with this hook.
> - The raid-partitions hook (included in the mdadm package) can only assemble one raid partition array, which is bad imho: you should be able to assemble more than one raid partition device from the command line if you wish to do so.
> - UUID is not supported by either of them.
> Suggestion: Shouldn't we replace all this with one hook that can handle all these cases? Now comes the catch: to achieve this, mdadm will be needed in the initramfs, which means having a 900k static binary in the early boot sequence. (The raid-partitions hook already uses this binary.) The trick I thought of would be to dynamically generate an mdadm.conf file from the boot command line, keeping the old assembling syntax and adding a new syntax for UUID support. Sample code is uploaded here: http://www.archlinux.org/~tpowa/mdadm.hook (not yet tested to boot a system, it's sample code!)
> Problems: Should this replace an existing hook or be a brand new hook? What should happen to the other hooks, so that we don't break a user's setup? Doing it with a NEWS item and an installation message should be fine imho.
> What do you guys think of this? Thanks for your input.
Haven't thought too hard on this, but a couple of points:
klibc should have basic raid assembly functionality built in. I was under the impression it worked for basic cases. If this is still true, then we shouldn't replace it, but make sure people are aware that the klibc one is really basic and the mdadm one is more in-depth.
The sample code with all those nested calls is really ugly and hard to read. Readability is always preferred if other people are going to be looking at / fixing / working with the code. It'd be nice to break that into more lines so that it's easier to understand.
Other than that, the idea seems sound. I do not have a raid system to play with, though.
> Haven't thought too hard on this, but a couple of points:
> klibc should have basic raid assembly functionality built in. I was under the impression it worked for basic cases. If this is still true, then we shouldn't replace it, but make sure people are aware that the klibc one is really basic and the mdadm one is more in-depth.
One real problem I experienced with the basic raid hook: if one device fails and you reboot, it will break while mounting the root fs. mdassemble from kinit doesn't like missing devices; you need to remove them from the boot command line. mdadm is much more tolerant of this.
> The sample code with all those nested calls is really ugly and hard to read. Readability is always preferred if other people are going to be looking at / fixing / working with the code. It'd be nice to break that into more lines so that it's easier to understand.
Okay, I'll try to make it cleaner. The problem here is the restricted environment. You need to break down every
    md=0,/dev/xyz,/dev/abc
into
    ARRAY /dev/md0 devices=/dev/xyz,/dev/abc
and I think replace is the only program we have there to do it.
> Other than that, the idea seems sound. I do not have a raid system to play with, though.
-- Tobias Powalowski Archlinux Developer & Package Maintainer (tpowa) http://www.archlinux.org tpowa@archlinux.org
On Wed, Feb 18, 2009 at 3:55 PM, Tobias Powalowski <t.powa@gmx.de> wrote:
>> Haven't thought too hard on this, but a couple of points:
>> klibc should have basic raid assembly functionality built in. I was under the impression it worked for basic cases. If this is still true, then we shouldn't replace it, but make sure people are aware that the klibc one is really basic and the mdadm one is more in-depth.
> One real problem I experienced with the basic raid hook: if one device fails and you reboot, it will break while mounting the root fs. mdassemble from kinit doesn't like missing devices; you need to remove them from the boot command line. mdadm is much more tolerant of this.
>> The sample code with all those nested calls is really ugly and hard to read. Readability is always preferred if other people are going to be looking at / fixing / working with the code. It'd be nice to break that into more lines so that it's easier to understand.
> Okay, I'll try to make it cleaner. The problem here is the restricted environment. You need to break down every md=0,/dev/xyz,/dev/abc into ARRAY /dev/md0 devices=/dev/xyz,/dev/abc and I think replace is the only program we have there to do it.
Yeah, I just meant something like
    devices=$(replace ......)
    foo=$(replace)
    bar=$(replace)
    echo "ARRAY $foo $bar $devices"
or whatever would be far easier to read.
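To make the "one step per line" idea concrete, here is a minimal sketch of the md= to ARRAY conversion discussed above. It is not the code from the sample hook: the function name md_to_array is hypothetical, and it assumes a POSIX shell with parameter expansion instead of the replace utility mentioned in the thread, which may not hold in the klibc environment.

```shell
#!/bin/sh
# Hypothetical helper (not from the real hook): turn one
# "md=N,dev1,dev2,..." kernel parameter into an mdadm.conf ARRAY line.
# Assumes POSIX parameter expansion is available in the shell.
md_to_array() {
    param=${1#md=}        # strip the "md=" prefix
    number=${param%%,*}   # everything before the first comma
    devices=${param#*,}   # everything after the first comma
    echo "ARRAY /dev/md${number} devices=${devices}"
}

md_to_array "md=0,/dev/xyz,/dev/abc"
```

Each intermediate value gets its own named variable, so a reader can follow the transformation without unpicking nested calls.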
Tobias Powalowski wrote:
> Now comes the catch: to achieve this, mdadm will be needed in the initramfs, which means having a 900k static binary in the early boot sequence. (The raid-partitions hook already uses this binary.)
I am really annoyed by the fact that almost nothing builds against klibc.
Wouldn't it be easier to switch to uclibc and busybox? Both are much more powerful, and we wouldn't need static binaries for lvm, cryptsetup or mdadm. We would also have far fewer problems with all the ABI changes that happen with klibc.
I thought we needed a complete toolchain for that, but Jan claimed that it is possible to use our normal gcc and binutils for that.
On Wed, Feb 18, 2009 at 7:38 PM, Thomas Bächler <thomas@archlinux.org> wrote:
> Tobias Powalowski wrote:
>> Now comes the catch: to achieve this, mdadm will be needed in the initramfs, which means having a 900k static binary in the early boot sequence. (The raid-partitions hook already uses this binary.)
> I am really annoyed by the fact that almost nothing builds against klibc.
> Wouldn't it be easier to switch to uclibc and busybox? Both are much more powerful, and we wouldn't need static binaries for lvm, cryptsetup or mdadm. We would also have far fewer problems with all the ABI changes that happen with klibc.
> I thought we needed a complete toolchain for that, but Jan claimed that it is possible to use our normal gcc and binutils for that.
I would be in support of this, as I have seen some of the headaches you have had to deal with when it comes to klibc. Assuming the initrd images wouldn't be insanely sized, I don't think this is a problem.
The only real point of klibc is that it is much, much smaller than a full-blown glibc, correct?
-Dan
On Wed, Feb 18, 2009 at 8:42 PM, Dan McGee <dpmcgee@gmail.com> wrote:
> On Wed, Feb 18, 2009 at 7:38 PM, Thomas Bächler <thomas@archlinux.org> wrote:
>> Tobias Powalowski wrote:
>>> Now comes the catch: to achieve this, mdadm will be needed in the initramfs, which means having a 900k static binary in the early boot sequence. (The raid-partitions hook already uses this binary.)
>> I am really annoyed by the fact that almost nothing builds against klibc.
>> Wouldn't it be easier to switch to uclibc and busybox? Both are much more powerful, and we wouldn't need static binaries for lvm, cryptsetup or mdadm. We would also have far fewer problems with all the ABI changes that happen with klibc.
>> I thought we needed a complete toolchain for that, but Jan claimed that it is possible to use our normal gcc and binutils for that.
> I would be in support of this, as I have seen some of the headaches you have had to deal with when it comes to klibc. Assuming the initrd images wouldn't be insanely sized, I don't think this is a problem.
> The only real point of klibc is that it is much, much smaller than a full-blown glibc, correct?
I'd be for it too, but I'd like to actually SEE an implementation before we commit to anything.
This will increase our initramfs size, but it will also give us more functionality.
I don't imagine much would need changing: just some of the tools used in the base mkinitcpio hook, and maybe removal or recompilation of many of the klibc-extras utilities.
At the very least, we could stick a uclibc package in extra so that we could begin playing with this, right?
Aaron Griffin wrote:
> I'd be for it too, but I'd like to actually SEE an implementation before we commit to anything.
I am not planning to commit to anything here, just seeing what the general opinion is.
> This will increase our initramfs size, but it will also give us more functionality.
Okay, klibc takes 72KB, uClibc about 500KB (this is a copy built with their buildroot, including ld, libcrypt, libdl and so on), and I think we can manually remove features we don't need. We would decrease the size for lvm, raid and cryptsetup users, and I guess we could also dump udev and replace it with busybox's mdev.
> I don't imagine much would need changing: just some of the tools used in the base mkinitcpio hook, and maybe removal or recompilation of many of the klibc-extras utilities.
Some of the scripts would be easier to write, as some of the standard tools would be available (unlike with klibc).
> At the very least, we could stick a uclibc package in extra so that we could begin playing with this, right?
We don't have to stick it in extra to play with it. I don't know about gcc spec files and such; maybe Jan can provide me with the links so I can look at this trick he mentioned.
On Thu, 2009-02-19 at 02:38 +0100, Thomas Bächler wrote:
> I thought we needed a complete toolchain for that, but Jan claimed that it is possible to use our normal gcc and binutils for that.
I'm using an unofficial debian package for uclibc. There's a package called uclibc-toolchain, which contains a script to munge the gcc spec file. Once this script has been run (we could alter the gcc spec file by default in the gcc package, btw), there's a new flag for gcc: -uclibc. Whenever you compile something with this option, include files are diverted to /usr/i486-linux-uclibc/include and libraries are linked against uclibc instead of glibc.
The only problem we have to solve is getting shared libraries, like the ones from devicemapper, installed in a way that doesn't conflict with the ones that link against glibc.
On Thursday, 19 February 2009, Jan de Groot wrote:
> On Thu, 2009-02-19 at 02:38 +0100, Thomas Bächler wrote:
>> I thought we needed a complete toolchain for that, but Jan claimed that it is possible to use our normal gcc and binutils for that.
> I'm using an unofficial debian package for uclibc. There's a package called uclibc-toolchain, which contains a script to munge the gcc spec file. Once this script has been run (we could alter the gcc spec file by default in the gcc package, btw), there's a new flag for gcc: -uclibc. Whenever you compile something with this option, include files are diverted to /usr/i486-linux-uclibc/include and libraries are linked against uclibc instead of glibc.
> The only problem we have to solve is getting shared libraries, like the ones from devicemapper, installed in a way that doesn't conflict with the ones that link against glibc.
Keep in mind that mdadm will not work with uclibc on x86_64; it's stated in the Makefile.
greetings tpowa -- Tobias Powalowski Archlinux Developer & Package Maintainer (tpowa) http://www.archlinux.org tpowa@archlinux.org
Hi,
http://bugs.archlinux.org/task/10651 and http://bugs.archlinux.org/task/9122 both state that raid is broken for more complex setups.
Following up on my earlier mail about assembling raid arrays, I wrote a new hook and install file. It uses mdassemble.static from the mdadm tarball, which is smaller than mdadm.static and does the assembling job and the loading of the needed raid module.
How does it work:
- If an array is defined in /etc/mdadm.conf on the user's system, the file will be added to the initramfs and used for assembling.
- If no array is defined, it falls back to command line assembling, and an mdadm.conf file will be created on the fly during bootup. The old command line syntax wasn't changed; UUID support is added (e.g. md=0,0900878d:f95f6057:c39a36e9:55efa61b).
Thoughts on it? A built x86_64 package and the other files are here: http://dev.archlinux.org/~tpowa/mdadm/
Attention: Tested a normal raid 1 setup with and without a modified mdadm.conf. UUID assembling from the command line and partitionable raid are not tested yet!
greetings tpowa
-- Tobias Powalowski Archlinux Developer & Package Maintainer (tpowa) http://www.archlinux.org tpowa@archlinux.org
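The fallback described above could look roughly like the following sketch. It is an illustration under assumptions, not the hook from the linked package: the function names (conf_defines_array, is_uuid, gen_mdadm_conf) are hypothetical, it assumes grep -E and POSIX parameter expansion are available, and it assumes a UUID is written as four colon-separated groups of eight hex digits, as in the example from the mail.

```shell
#!/bin/sh
# Sketch only; all function names are hypothetical.

# Succeeds if the given mdadm.conf exists and defines at least one array.
conf_defines_array() {
    [ -f "$1" ] && grep -q '^ARRAY' "$1"
}

# Succeeds if $1 looks like an md UUID (assumed format:
# four colon-separated groups of eight hex digits).
is_uuid() {
    echo "$1" | grep -Eq '^[0-9a-fA-F]{8}(:[0-9a-fA-F]{8}){3}$'
}

# Print one ARRAY line for every md=... word in a command line string.
gen_mdadm_conf() {
    for word in $1; do
        case "$word" in
            md=*)
                spec=${word#md=}      # "0,<uuid>" or "0,/dev/a,/dev/b"
                number=${spec%%,*}    # array number
                rest=${spec#*,}       # UUID or device list
                if is_uuid "$rest"; then
                    echo "ARRAY /dev/md${number} UUID=${rest}"
                else
                    echo "ARRAY /dev/md${number} devices=${rest}"
                fi
                ;;
        esac
    done
}

cmdline="root=/dev/md0 md=0,0900878d:f95f6057:c39a36e9:55efa61b md=1,/dev/sda2,/dev/sdb2"
gen_mdadm_conf "$cmdline"
```

In a real hook the string would presumably come from /proc/cmdline, and the generated lines would only be written when conf_defines_array fails for the bundled /etc/mdadm.conf.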
On Mon, Mar 2, 2009 at 4:33 PM, Tobias Powalowski <t.powa@gmx.de> wrote:
> Hi, http://bugs.archlinux.org/task/10651 and http://bugs.archlinux.org/task/9122 both state that raid is broken for more complex setups.
> Following up on my earlier mail about assembling raid arrays, I wrote a new hook and install file. It uses mdassemble.static from the mdadm tarball, which is smaller than mdadm.static and does the assembling job and the loading of the needed raid module.
> How does it work:
> - If an array is defined in /etc/mdadm.conf on the user's system, the file will be added to the initramfs and used for assembling.
> - If no array is defined, it falls back to command line assembling, and an mdadm.conf file will be created on the fly during bootup. The old command line syntax wasn't changed; UUID support is added (e.g. md=0,0900878d:f95f6057:c39a36e9:55efa61b).
> Thoughts on it? A built x86_64 package and the other files are here: http://dev.archlinux.org/~tpowa/mdadm/ Attention: Tested a normal raid 1 setup with and without a modified mdadm.conf. UUID assembling from the command line and partitionable raid are not tested yet!
I just looked at the initramfs stuff and it looks good. Maybe I'll take one of my unused laptops and make a bunk raid array on it for testing. :)
participants (5)
- Aaron Griffin
- Dan McGee
- Jan de Groot
- Thomas Bächler
- Tobias Powalowski