[pacman-dev] makepkg download fails with special chars in URL (was: [signoff] pacman 3.2.0)
On Thu, Jul 31, 2008 at 8:15 AM, Xavier <shiningxc@gmail.com> wrote:
On Thu, Jul 31, 2008 at 2:50 PM, Allan McRae <allan@archlinux.org> wrote:
Indeed, that is the first problem, netfile should be quoted here. local proto=$(echo "$netfile" | sed 's|://.*||')
And that $netfile should be $url....
Yeah, apparently.
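[Editor's sketch, not from the thread: the same protocol extraction can be done with plain bash suffix removal instead of sed, which sidesteps quoting entirely. The URL below is a made-up sample.]

```shell
#!/bin/bash
# Sketch (sample URL assumed): extract the protocol without sed,
# using bash's longest-suffix removal.
url='http://www.example.com/file.tar.gz'
proto=${url%%://*}   # remove the longest suffix matching '://*'
echo "$proto"        # prints: http
```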
But once this is fixed, we run into a second problem, with another sed command, which breaks because of all the & in the url. local dlcmd=$(echo "$dlagent" | sed "s|%o|$file.part|" | sed "s|%u|$netfile|"
$ netfile="ab&cd&ef"; echo "%u" | sed "s|%u|$netfile|"
ab%ucd%uef
$ netfile="ab\&cd\&ef"; echo "%u" | sed "s|%u|$netfile|"
ab&cd&ef
So I don't see how to fix this except manually escaping all & in the url...
Well, here is a way to just make the bug more obscure.
$ netfile="ab&cd&ef"; echo "%u" | sed "s|%u|$netfile|" | sed "s|\%u|\&|g"
ab&cd&ef
But that causes problems if there is a %u in the URL. Still, could we do some multistage hackery like that?
I really don't like this.
I think we could use a bash feature instead: echo ${netfile//foo/bar}. But even after converting the two sed rules above to this, I still have another problem. It is driving me crazy: when I print the whole download command myself and copy/paste it into the shell, it works fine. But when it gets executed in makepkg with $(get_downloadcmd foo bar), it always fails.
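[Editor's sketch of the ${var//pat/repl} idea with assumed sample values, not the real makepkg variables: unlike sed, bash takes the replacement string literally, so ampersands in the URL need no escaping.]

```shell
#!/bin/bash
# Sketch (sample values assumed): bash pattern substitution treats
# the replacement literally, so '&' in the URL survives unescaped.
netfile='http://example.com/dl.php?a=1&b=2'
dlagent='/usr/bin/wget -O %o %u'
file='dl'
dlcmd=${dlagent//%o/$file.part}   # substitute the output file for %o
dlcmd=${dlcmd//%u/$netfile}       # substitute the URL for %u
echo "$dlcmd"   # prints: /usr/bin/wget -O dl.part http://example.com/dl.php?a=1&b=2
```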
Subject change...and can someone open a bug and get this information in there, please? I'd rather track it there before 10 other people complain too. -Dan
Dan McGee wrote:
On Thu, Jul 31, 2008 at 8:15 AM, Xavier <shiningxc@gmail.com> wrote:
On Thu, Jul 31, 2008 at 2:50 PM, Allan McRae <allan@archlinux.org> wrote:
Indeed, that is the first problem, netfile should be quoted here. local proto=$(echo "$netfile" | sed 's|://.*||')
And that $netfile should be $url....
Yeah, apparently.
But once this is fixed, we run into a second problem, with another sed command, which breaks because of all the & in the url. local dlcmd=$(echo "$dlagent" | sed "s|%o|$file.part|" | sed "s|%u|$netfile|"
$ netfile="ab&cd&ef"; echo "%u" | sed "s|%u|$netfile|"
ab%ucd%uef
$ netfile="ab\&cd\&ef"; echo "%u" | sed "s|%u|$netfile|"
ab&cd&ef
So I don't see how to fix this except manually escaping all & in the url...
Well, here is a way to just make the bug more obscure.
$ netfile="ab&cd&ef"; echo "%u" | sed "s|%u|$netfile|" | sed "s|\%u|\&|g"
ab&cd&ef
But that causes problems if there is a %u in the URL. Still, could we do some multistage hackery like that?
I really don't like this.
I think we could use a bash feature instead: echo ${netfile//foo/bar}
I went for this approach: local netfile=$(echo "$2" | sed "s|\&|\\\&|g")
But even after converting the two sed rules above to this, I still have another problem. It is driving me crazy: when I print the whole download command myself and copy/paste it into the shell, it works fine. But when it gets executed in makepkg with $(get_downloadcmd foo bar), it always fails.
I'm stuck there too...
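[Editor's sketch of the ampersand-escaping approach quoted above, with toy values rather than the real makepkg variables.]

```shell
#!/bin/sh
# Toy demo: escape each '&' so sed's replacement side treats it
# literally instead of as "the whole matched text".
raw='ab&cd&ef'
netfile=$(echo "$raw" | sed 's|&|\\&|g')   # -> ab\&cd\&ef
echo "%u" | sed "s|%u|$netfile|"           # prints: ab&cd&ef
```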
Subject change...and can someone open a bug and get this information in there, please? I'd rather track it there before 10 other people complain too.
Filing bug report now. Allan
Allan McRae wrote:
Dan McGee wrote:
Subject change...and can someone open a bug and get this information in there, please? I'd rather track it there before 10 other people complain too.
Filing bug report now.
This is where I have got up to in searching for the solution:

1) Fix sed statement to extract protocol in get_downloadclient()
local proto=$(echo "$url" | sed 's|://.*||')

2) escape ampersands in netfile in get_downloadcmd()
local netfile=$(echo "$2" | sed "s|\&|\\\&|g")

Then $dlcmd gets created correctly, but adding the following debug messages at the end of get_downloadcmd() shows what is happening...

plain "$dlcmd"
/usr/bin/wget -c -t 3 --waitretry=3 -O mythplugins-0.21.tar.bz2.part http://www.mythtv.org/modules.php?name=Downloads&d_op=getit&lid=136&foo=/mythplugins-0.21.tar.bz2

plain "$(echo $dlcmd)"
/usr/bin/wget -c -t 3 --waitretry=3 -O mythplugins-0.21.tar.bz2.part

So this does not pass the URL to the actual wget call, which explains the "wget: missing URL" error message I am left with. Note that once that is fixed, the ampersands in the URL need to be escaped again for the wget command to actually work.

I need sleep, so I leave it to people in more appropriate time zones to find a solution. :)

Allan
On Thu, Jul 31, 2008 at 5:27 PM, Allan McRae <allan@archlinux.org> wrote:
This is where I have got up to in searching for the solution:
1) Fix sed statement to extract protocol in get_downloadclient() local proto=$(echo "$url" | sed 's|://.*||')
Fine
2) escape ampersands in netfile in get_downloadcmd() local netfile=$(echo "$2" | sed "s|\&|\\\&|g")
I wanted to try using bash substitution instead of sed for replacing %u and %o:
dlcmd=${dlagent//%o/$file.part}
dlcmd=${dlcmd//%u/$netfile}
But then, netfile still needs to be quoted, so:
dlcmd=${dlagent//%o/$file.part}
dlcmd=${dlcmd//%u/\"$netfile\"}
And just doing this breaks the whole vim syntax highlighting, which is very, very annoying.
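[Editor's sketch of the quoted-replacement variant above, with assumed sample values: escaped double quotes inside the replacement end up as literal quotes around the URL.]

```shell
#!/bin/bash
# Sketch (sample dlagent/netfile assumed): embed literal double
# quotes around the URL in the substituted command string.
netfile='http://example.com/a?b=1&c=2'
dlagent='wget -O %o %u'
file='x'
dlcmd=${dlagent//%o/$file.part}
dlcmd=${dlcmd//%u/\"$netfile\"}   # \" becomes a literal quote
echo "$dlcmd"   # prints: wget -O x.part "http://example.com/a?b=1&c=2"
```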
Then $dlcmd gets created correctly, but adding the following debug messages at the end of get_downloadcmd() shows what is happening...
plain "$dlcmd"
/usr/bin/wget -c -t 3 --waitretry=3 -O mythplugins-0.21.tar.bz2.part http://www.mythtv.org/modules.php?name=Downloads&d_op=getit&lid=136&foo=/mythplugins-0.21.tar.bz2
plain "$(echo $dlcmd)"
/usr/bin/wget -c -t 3 --waitretry=3 -O mythplugins-0.21.tar.bz2.part
So this does not pass the URL to the actual wget call, which explains the "wget: missing URL" error message I am left with. Note that once that is fixed, the ampersands in the URL need to be escaped again for the wget command to actually work.
I need sleep, so I leave it to people in more appropriate time zones to find a solution. :)
I got stuck on exactly the same issue; I just spent at least an hour on it for nothing, and it really pissed me off. Now I need sleep, so I leave it to people in more appropriate time zones to find a solution. :) (hello Dan :D) More seriously, I don't consider this a huge issue. The biggest issue here, in my opinion, is these fucked-up URLs; I really hate them. I can't stand URLs that are not easy to read, write, and remember, and that cause a ton of stupid issues like this.
On Thu, 31 Jul 2008 21:53:44 +0200 Xavier <shiningxc@gmail.com> wrote:
On Thu, Jul 31, 2008 at 5:27 PM, Allan McRae <allan@archlinux.org> wrote:
Then $dlcmd gets created correctly, but adding the following debug messages at the end of get_downloadcmd() shows what is happening...
plain "$dlcmd"
/usr/bin/wget -c -t 3 --waitretry=3 -O mythplugins-0.21.tar.bz2.part http://www.mythtv.org/modules.php?name=Downloads&d_op=getit&lid=136&foo=/mythplugins-0.21.tar.bz2
plain "$(echo $dlcmd)"
/usr/bin/wget -c -t 3 --waitretry=3 -O mythplugins-0.21.tar.bz2.part
So this does not pass the URL to the actual wget call, which explains the "wget: missing URL" error message I am left with. Note that once that is fixed, the ampersands in the URL need to be escaped again for the wget command to actually work.
I need sleep, so I leave it to people in more appropriate time zones to find a solution. :)
I got stuck on exactly the same issue; I just spent at least an hour on it for nothing, and it really pissed me off. Now I need sleep, so I leave it to people in more appropriate time zones to find a solution. :) (hello Dan :D)
I think I found the reason for this: it's not the ampersands, it's the question mark. It looks like bash tries to do filename expansion on the URL. This worked in the old version because no file was found and thus the URL was not replaced. But in the new version nullglob is set, so the URL is removed. There are several ways to fix this behaviour:
1. Disable nullglob and re-enable it afterwards
2. Enable nullglob only when it is really needed
3. Disable filename expansion and re-enable it afterwards
I think 1 is a bit weird, and I don't know the code well enough to say anything about 2, so I lean heavily towards 3.
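[Editor's sketch of the failure mode and of option 3, with a made-up URL; this is bash-specific because of shopt.]

```shell
#!/bin/bash
# Demo of the diagnosis above: with nullglob set, an unquoted word
# containing '?' is treated as a glob pattern and, matching no file,
# is removed entirely -- the URL vanishes from the command line.
shopt -s nullglob
url='http://example.com/modules.php?name=Downloads'
set -- $url                       # unquoted: subject to filename expansion
echo "with nullglob: $# arg(s)"   # 0 args - the URL disappeared

# Option 3: temporarily disable filename expansion around the expansion
set -f
set -- $url
echo "with set -f:   $# arg(s)"   # 1 arg - the URL survives intact
set +f
shopt -u nullglob
```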
2) escape ampersands in netfile in get_downloadcmd() local netfile=$(echo "$2" | sed "s|\&|\\\&|g")
I wanted to try using bash substitution instead of sed for replacing %u and %o:
dlcmd=${dlagent//%o/$file.part}
dlcmd=${dlcmd//%u/$netfile}
But then, netfile still needs to be quoted, so:
dlcmd=${dlagent//%o/$file.part}
dlcmd=${dlcmd//%u/\"$netfile\"}
And just doing this breaks the whole vim syntax highlighting, which is very very annoying.
From my point of view, this would be a good reason to stick with escaping. I thought about some other way to do this without sed, but I couldn't come up with anything useful. :(
I can't stand URLs that are not easy to read, write, and remember, and that cause a ton of stupid issues like this.
+1
participants (4)
- Allan McRae
- Dan McGee
- Henning Garus
- Xavier