[arch-general] [arch-dev-public] AUR ToS (aka making AUR user names public)

Henrik Danielsson h.danielsson at gmail.com
Mon Mar 6 13:46:20 UTC 2017


2017-03-06 14:36 GMT+01:00 Mauro Santos via arch-general
<arch-general at archlinux.org>:
> On 06-03-2017 12:45, Henrik Danielsson via arch-general wrote:
>> 2017-03-06 12:53 GMT+01:00 Mauro Santos via arch-general
>> <arch-general at archlinux.org>:
>>
>>> On 06-03-2017 11:20, Henrik Danielsson via arch-general wrote:
>>>> 2017-03-06 11:18 GMT+01:00 Ralf Mardorf <silver.bullet at zoho.com>:
>>>>>
>>>>> Privacy is a principle. You seem not to understand the difference
>>>>> between giving somebody data with the formal permission to use this data
>>>>> and data that simply is available for everybody, but not explicitly
>>>>> handed over to somebody. Paranoia isn't involved in my concern.
>>>>>
>>>> My standpoint is that privacy does not apply to this kind of public
>>>> information, simply because it's not private and by no means sensitive
>>>> (people freely chose the username and other visible info they posted, no?).
>>>> Thus, no, I see no difference and really no point in even considering
>>>> trying to keep such information private.
>>>>
>>>> What anyone does with the freely available information posted in the AUR is
>>>> up to them ("mining" it or handing it over to someone else included). We
>>>> could not do anything about it anyway, nor would I even care if I was in
>>>> that list or not, since there seems to be no ToS between the one submitting
>>>> that information and the one publishing it. Since it was freely submitted
>>>> without any terms, I simply cannot find any restrictions on its usage.
>>>>
>>>> Yes, we should have a ToS to at least keep the principle of privacy alive.
>>>> But let's face it, real privacy online has long been dead, if it ever
>>>> existed.
>>>>
>>>> If there was a ToS, the situation would perhaps have been different, at
>>>> least legally. I'm no legal expert of course, but to me it makes perfect
>>>> sense that if you posted something on the internet, in a very public space,
>>>> you can have no expectations of keeping any of that information private in
>>>> any way, nor any information easily associated with it.
>>>> No, I don't see that as a problem, at least not if you never explicitly
>>>> agreed that information would not be shared. What I really want to keep
>>>> private I don't post anywhere.
>>>>
>>>
>>> I think the point here is not so much privacy, as I believe everyone
>>> recognizes that the information that was asked for (the full list of
>>> usernames) is public and can be scraped.
>>>
>>> The point here is handing over the full list of usernames on request. Do
>>> note that in their research proposal[1] they specifically mention
>>> scraping information from github. That information is public, github
>>> does have an API to query that information, but they still have to
>>> scrape it. I suppose that implies github does not hand it over wholesale
>>> on request, so why should we? This might be due to their ToS, or they know
>>> something we don't.
>>>
>> It would be rather interesting to see what they could come up with from
>> that correlation.
>
> Probably nothing meaningful. As I've said before you have no way of
> knowing if user foo on github is the same as user foo on the AUR.
>
True, but you could make a decent guess based on how many coincidences
there are surrounding those names.
Relations between names could be interesting even if the people behind
them are not the same.

>> I think, perhaps a bit cynically, the reason github may not hand over that
>> data directly is likely that they don't want to do some of the work of the
>> researchers for them. As you said, the data is there, the format matters
>> less if they're going to massage it into something else later anyway, so
>> why bother with the effort of compiling it on their [github] own time?
>>
>> We could simply deny the AUR username request for the same reason, or no
>> reason at all. Since some people seem uncomfortable about what could be
>> derived from a potential correlation of publicly available data, that's
>> most likely the safest way to go.
>>
>
>
> --
> Mauro Santos

