On Fri, Sep 10, 2010 at 3:27 PM, Pierre Schmitz <pierre@archlinux.de> wrote:
On Fri, 10 Sep 2010 16:16:46 -0400, Daenyth Blank <daenyth+arch@gmail.com> wrote:
On Fri, Sep 10, 2010 at 16:15, Ionuț Bîru <ibiru@archlinux.org> wrote:
I noticed this myself when I tried to submit the data from another machine in my network.
As an idea, we could use the UUID of the root partition.

Maybe. Would it make more sense to take a hash of the eth0 MAC address?

Not sure if that is sensible... I guess the UUID doesn't change that often.
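For illustration only, here is a minimal sketch of what "a hash of the eth0 MAC address" could look like. The function name `client_id_from_mac` and the choice of SHA-256 are assumptions made for this example and are not part of pkgstats.

```python
import hashlib


def client_id_from_mac(mac: str) -> str:
    """Return a hex SHA-256 digest of a normalized MAC address.

    Hypothetical helper: the name and the hash choice are assumptions.
    """
    # Normalize case and separators so the same card always gives the same id.
    normalized = mac.strip().lower().replace("-", ":")
    return hashlib.sha256(normalized.encode("ascii")).hexdigest()
```

On Linux the address of an interface can usually be read from /sys/class/net/eth0/address. The value is still chosen by the client, so on its own it does not prevent forged submissions.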
Well, we have discussed all this before. If I don't limit submissions by IP, it will be too easy for a single person to flood us with false data, making the whole stats pointless. The IP is the only value you cannot easily spoof over the Internet.
Sure, I'm not saying we shouldn't validate IP addresses at all, but the limit should probably be higher than 1 submission per IP in the given time frame.
Whatever we would implement on the client side (pkgstats) doesn't matter, as you can still post your data directly or just modify the script. (And yes, client SSL certs are overkill and people won't use pkgstats.)
One thing I could do though is to allow more than one submission per IP and day. What would be a reasonable value? Like 10 submissions per IP within 24h?
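A rough sketch of such a per-IP limit, assuming a sliding 24h window kept in memory. The class name `IpRateLimiter` and its storage are hypothetical, and the default of 10 is just the value floated above, not a decided setting.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 24 * 60 * 60
MAX_PER_IP = 10  # value floated in the thread, not a decided setting


class IpRateLimiter:
    """Allow at most `limit` submissions per IP within `window` seconds."""

    def __init__(self, limit=MAX_PER_IP, window=WINDOW_SECONDS):
        self.limit = limit
        self.window = window
        self._seen = defaultdict(deque)  # ip -> timestamps of accepted submissions

    def allow(self, ip, now):
        stamps = self._seen[ip]
        # Drop timestamps that have fallen out of the window.
        while stamps and now - stamps[0] >= self.window:
            stamps.popleft()
        if len(stamps) >= self.limit:
            return False
        stamps.append(now)
        return True
```

A real server would keep these counters in its database rather than in process memory; the sketch only shows the counting rule.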
What about something like this:

1. Submit something "unique" but relatively harmless: the first network device's MAC address seems reasonable. Root UUID would probably be a bit more work.
2. This suggestion forms an (IP, MAC) combo. If we've seen it before, let it through; what does it matter? We should just update the statistics list for this guy.
3. Same IP, new MAC, and MAC is nowhere else in system: let it through if we haven't had more than X (5? 10?) submissions in the last 24h from this IP.
4. Same IP, new MAC, MAC is already in system: update the stored IP address of the system entry and allow the submission through, overwriting the old submission.
5. Different IP, MAC already in system: same as above in 4. Change the system entry and then allow the submission, replacing old values.
6. And so on. We can "trust" the IP address, we can't trust the MAC address.

Every month, cull the stats: if we haven't heard from you in two months, you are removed from the counted values. Gather submissions once a week or so. Thus someone that wanted to poison the stats would have to keep up with the submissions from all of their bogus MAC addresses.

-Dan
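The rules above could be sketched roughly as follows. This is only one reading of the proposal: `StatsStore`, `SystemEntry`, the in-memory dictionaries, X = 5, and "two months" taken as 60 days are all assumptions made for the example, and the cap is applied to new MACs per IP rather than to all submissions.

```python
from dataclasses import dataclass

DAY = 24 * 60 * 60
MAX_NEW_MACS_PER_IP = 5   # "X" in step 3; 5 is one of the floated values
CULL_AFTER = 60 * DAY     # "two months", approximated as 60 days


@dataclass
class SystemEntry:
    ip: str
    last_seen: float
    packages: list


class StatsStore:
    def __init__(self):
        self.systems = {}    # mac -> SystemEntry
        self.new_macs = {}   # ip -> timestamps at which a new MAC was accepted

    def submit(self, ip, mac, packages, now):
        entry = self.systems.get(mac)
        if entry is not None:
            # Steps 2, 4 and 5: known MAC. Trust the IP, refresh the entry
            # and overwrite the previous submission.
            entry.ip = ip
            entry.last_seen = now
            entry.packages = list(packages)
            return True
        # Step 3: unknown MAC. Cap how many new systems one IP may add per 24h.
        recent = [t for t in self.new_macs.get(ip, []) if now - t < DAY]
        if len(recent) >= MAX_NEW_MACS_PER_IP:
            self.new_macs[ip] = recent
            return False
        recent.append(now)
        self.new_macs[ip] = recent
        self.systems[mac] = SystemEntry(ip, now, list(packages))
        return True

    def cull(self, now):
        # Remove systems not heard from within CULL_AFTER; return how many went.
        stale = [m for m, e in self.systems.items() if now - e.last_seen > CULL_AFTER]
        for m in stale:
            del self.systems[m]
        return len(stale)
```

With this reading, a flood of bogus MACs from one IP is capped at X new systems per day, and each of those has to be resubmitted regularly or it is culled.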