On Sun, Oct 13, 2013 at 7:55 PM, Allan McRae <allan@archlinux.org> wrote:
> I am going to merge all these patches apart from this one and the final patch. If a consensus can be found on how to deal with this issue, I will pull it in - I am not familiar enough with python issues to make the decision myself.
Thanks, Allan. I'm gratified that I can help make (some small) improvements.

Sorry I got delayed, but I said I would explain how the Python 2 string gotchas impact the pacman testing framework. I think I found a way to shorten this from what I had anticipated, so hopefully it won't be completely boring...

There are two pmtests with non-7bit-ascii chars: remove071 and sync600. remove071 creates one pmpkg (p2) and adds it to the "local" pmdb. sync600 copies and pastes that same p2 pmpkg setup, but also creates sp2 and adds it to the "sync" pmdb.

The framework does very different things for the two pmdbs: "local" stuff gets written to the filesystem (simulating in Python code what pacman would do to install), while "sync" stuff gets written to a tarfile (for later processing by the pacman binary being tested). That is the key difference and the stumbling block (and also why this can't be dealt with in sync600 alone).

Python's filesystem write API gracefully handles strings of all sorts, automatically converting char-to-byte as needed, so the "local" pmpkg p2 (in both pmtests) works great, but...

The tarfile.addfile API requires a fileobject, so the caller of that API is responsible for handling the low-level char-to-byte conversion. Python 2.7's StringIO meets that need. But in 3.x there aren't just fileobjects: there's BufferedIOBase (the parent class of BytesIO), which reads and writes bytes, and TextIOBase (the parent class of StringIO), which reads and writes chars. tarfile.addfile writes bytes, so in 3.x it fails when it tries to read bytes from a TextIOBase.

So how do we feed tarfile.addfile what it wants without special-casing for the Python runtime version? Rather than typing up a long explanation of why there is no way to meet that goal, I've attached a Python script that tries all the options I could think of and prints the reason for failure in each case.
The last line of the printout lists the successful options - those that work for that particular Python runtime. Running it on 2.7 and 3.3 shows that no single option is successful for both.

The attached script covers what Martin suggested (assuming I haven't misunderstood what he meant). And if anyone can think of an option that I didn't, please post a reply - I love learning new things.

Jeremy
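
P.S. For anyone who wants to see the 3.x side of this without opening the attachment, here is a minimal sketch. It is not the attached script and it only speaks to Python 3. The names (DESC, build_tar, text_stream_error, bytes_stream_roundtrip) and the utf-8 encoding are assumptions made up for illustration, not anything from the pacman test framework. It shows tarfile.addfile rejecting a StringIO and accepting a BytesIO once the text has been encoded by the caller.

```python
import io
import tarfile

# Hypothetical package description containing a non-7bit-ascii char.
DESC = u"%DESC%\nPackage with an umlaut: \u00e4\n"


def build_tar(name, fileobj, size):
    """Write one member to an in-memory tar and return the archive bytes."""
    out = io.BytesIO()
    with tarfile.open(fileobj=out, mode="w") as tar:
        info = tarfile.TarInfo(name)
        info.size = size
        tar.addfile(info, fileobj)
    return out.getvalue()


def text_stream_error(text):
    """Return the TypeError raised on 3.x when addfile reads from a text stream."""
    try:
        # StringIO.read() yields str, which the binary tar stream refuses.
        build_tar("desc", io.StringIO(text), len(text))
    except TypeError as exc:
        return exc
    return None


def bytes_stream_roundtrip(text, encoding="utf-8"):
    """Encode up front, add via BytesIO, then read the member back as text."""
    payload = text.encode(encoding)
    # The member size must be the byte count, not the char count.
    raw = build_tar("desc", io.BytesIO(payload), len(payload))
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r") as tar:
        return tar.extractfile("desc").read().decode(encoding)
```

Calling text_stream_error(DESC) on 3.x returns the TypeError, while bytes_stream_roundtrip(DESC) returns the original text. Note that the byte count and the char count differ as soon as a non-ASCII char is present, which is why the size is taken from the encoded payload.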