[aur-dev] [PATCH] valid_email :: add dns check
When an email address is of the correct format check if there is an mx record for the domain part. This forces people wanting to trash something in aur to use a domain part with an mx record. This will not stop invalid email, it only adds an extra check and a possible help for typos. Signed-off-by: BlackEagle <ike.devolder@gmail.com> --- web/lib/aur.inc.php | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/web/lib/aur.inc.php b/web/lib/aur.inc.php index c662b80..7e9d7de 100644 --- a/web/lib/aur.inc.php +++ b/web/lib/aur.inc.php @@ -80,7 +80,14 @@ function check_sid($dbh=NULL) { # verify that an email address looks like it is legitimate # function valid_email($addy) { - return (filter_var($addy, FILTER_VALIDATE_EMAIL) !== false); + if (filter_var($addy, FILTER_VALIDATE_EMAIL) === false) + return false; + + $domain = substr($addy, strrpos($addy, '@')+1); + if (!checkdnsrr($domain, 'MX') || checkdnsrr($domain, 'A')) + return false; + + return true; } # a new seed value for mt_srand() -- 1.7.9.4
On Mon, Mar 19, 2012 at 09:48:36PM +0100, BlackEagle wrote:
When an email address is of the correct format check if there is an mx record for the domain part. This forces people wanting to trash something in aur to use a domain part with an mx record.
This will not stop invalid email, it only adds an extra check and a possible help for typos.
Signed-off-by: BlackEagle <ike.devolder@gmail.com> --- web/lib/aur.inc.php | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/web/lib/aur.inc.php b/web/lib/aur.inc.php index c662b80..7e9d7de 100644 --- a/web/lib/aur.inc.php +++ b/web/lib/aur.inc.php @@ -80,7 +80,14 @@ function check_sid($dbh=NULL) { # verify that an email address looks like it is legitimate # function valid_email($addy) { - return (filter_var($addy, FILTER_VALIDATE_EMAIL) !== false); + if (filter_var($addy, FILTER_VALIDATE_EMAIL) === false) + return false;
This isn't noted down for the AUR yet but we always use opening and closing braces in other Arch projects, such as pacman (even if it's just a one-line block). It probably makes sense to add such a rule to our coding guidelines. Could you please change that when resubmitting the patch? I'll take care of adding that to our guidelines later...
+ + $domain = substr($addy, strrpos($addy, '@')+1);
Please add spaces before and after the plus sign (it should look like "strrpos($addy, '@') + 1").
+ if (!checkdnsrr($domain, 'MX') || checkdnsrr($domain, 'A'))
That check doesn't look correct to me. Why would we reject mail addresses if there's an A record for the domain part? You probably forgot parentheses and/or a negation ("!") here. Also, I'm not sure why you're checking for A records at all? Could you please elaborate on this? Oh, and don't forget to add braces here as well :)
+ return false; + + return true; }
# a new seed value for mt_srand() -- 1.7.9.4
When an email address is of the correct format check if there is an mx record for the domain part. This forces people wanting to trash something in aur to use a domain part with an mx record. This will not stop invalid email, it only adds an extra check and a possible help for typos. Signed-off-by: BlackEagle <ike.devolder@gmail.com> --- web/lib/aur.inc.php | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/web/lib/aur.inc.php b/web/lib/aur.inc.php index c662b80..48fb860 100644 --- a/web/lib/aur.inc.php +++ b/web/lib/aur.inc.php @@ -80,7 +80,16 @@ function check_sid($dbh=NULL) { # verify that an email address looks like it is legitimate # function valid_email($addy) { - return (filter_var($addy, FILTER_VALIDATE_EMAIL) !== false); + if (filter_var($addy, FILTER_VALIDATE_EMAIL) === false) { + return false; + } + + $domain = substr($addy, strrpos($addy, '@') + 1); + if (checkdnsrr($domain, 'MX') === false) { + return false; + } + + return true; } # a new seed value for mt_srand() -- 1.7.9.4
Op maandag 19 maart 2012 22:30:55 schreef u:
When an email address is of the correct format check if there is an mx record for the domain part. This forces people wanting to trash something in aur to use a domain part with an mx record.
This will not stop invalid email, it only adds an extra check and a possible help for typos.
Signed-off-by: BlackEagle <ike.devolder@gmail.com> --- web/lib/aur.inc.php | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/web/lib/aur.inc.php b/web/lib/aur.inc.php index c662b80..48fb860 100644 --- a/web/lib/aur.inc.php +++ b/web/lib/aur.inc.php @@ -80,7 +80,16 @@ function check_sid($dbh=NULL) { # verify that an email address looks like it is legitimate # function valid_email($addy) { - return (filter_var($addy, FILTER_VALIDATE_EMAIL) !== false); + if (filter_var($addy, FILTER_VALIDATE_EMAIL) === false) { + return false; + } + + $domain = substr($addy, strrpos($addy, '@') + 1); + if (checkdnsrr($domain, 'MX') === false) { + return false; + } + + return true; }
# a new seed value for mt_srand() -- 1.7.9.4
modified patch, with only MX record check, removed the A record check since it was unnecessary. --Ike
On Mon, Mar 19, 2012 at 11:14 PM, Lukas Fleischer <archlinux@cryptocrack.de>wrote:
On Mon, Mar 19, 2012 at 09:48:36PM +0100, BlackEagle wrote:
When an email address is of the correct format check if there is an mx record for the domain part. This forces people wanting to trash something in aur to use a domain part with an mx record.
This will not stop invalid email, it only adds an extra check and a possible help for typos.
Signed-off-by: BlackEagle <ike.devolder@gmail.com> --- web/lib/aur.inc.php | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/web/lib/aur.inc.php b/web/lib/aur.inc.php index c662b80..7e9d7de 100644 --- a/web/lib/aur.inc.php +++ b/web/lib/aur.inc.php @@ -80,7 +80,14 @@ function check_sid($dbh=NULL) { # verify that an email address looks like it is legitimate # function valid_email($addy) { - return (filter_var($addy, FILTER_VALIDATE_EMAIL) !== false); + if (filter_var($addy, FILTER_VALIDATE_EMAIL) === false) + return false;
This isn't noted down for the AUR yet but we always use opening and closing braces in other Arch projects, such as pacman (even if it's just a one-line block). It probably makes sense to add such a rule to our coding guidelines.
Could you please change that when resubmitting the patch? I'll take care of adding that to our guidelines later...
+ + $domain = substr($addy, strrpos($addy, '@')+1);
Please add spaces before and after the plus sign (it should look like "strrpos($addy, '@') + 1").
+ if (!checkdnsrr($domain, 'MX') || checkdnsrr($domain, 'A'))
That check doesn't look correct to me. Why would we reject mail addresses if there's an A record for the domain part? You probably forgot parentheses and/or a negation ("!") here.
Also, I'm not sure why you're checking for A records at all? Could you please elaborate on this?
Email delivery falls back to where the A record points if an MX record does not exist. https://en.wikipedia.org/wiki/MX_record#History_of_fallback_to_A I think the correct check should be: if (! (checkdnsrr($domain, 'MX') || checkdnsrr($domain, 'A'))) { return false; }
Oh, and don't forget to add braces here as well :)
+ return false; + + return true; }
# a new seed value for mt_srand() -- 1.7.9.4
When an email address is of the correct format check if there is an mx record for the domain part. This forces people wanting to trash something in aur to use a domain part with an mx record. This will not stop invalid email, it only adds an extra check and a possible help for typos. Signed-off-by: BlackEagle <ike.devolder@gmail.com> --- web/lib/aur.inc.php | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/web/lib/aur.inc.php b/web/lib/aur.inc.php index c662b80..c6eb954 100644 --- a/web/lib/aur.inc.php +++ b/web/lib/aur.inc.php @@ -80,7 +80,16 @@ function check_sid($dbh=NULL) { # verify that an email address looks like it is legitimate # function valid_email($addy) { - return (filter_var($addy, FILTER_VALIDATE_EMAIL) !== false); + if (filter_var($addy, FILTER_VALIDATE_EMAIL) === false) { + return false; + } + + $domain = substr($addy, strrpos($addy, '@') + 1); + if (! (checkdnsrr($domain, 'MX') || checkdnsrr($domain, 'A'))) { + return false; + } + + return true; } # a new seed value for mt_srand() -- 1.7.9.4
On Mon, Mar 19, 2012 at 11:33:29PM +0200, Slavi Pantaleev wrote:
On Mon, Mar 19, 2012 at 11:14 PM, Lukas Fleischer <archlinux@cryptocrack.de>wrote:
On Mon, Mar 19, 2012 at 09:48:36PM +0100, BlackEagle wrote:
When an email address is of the correct format check if there is an mx record for the domain part. This forces people wanting to trash something in aur to use a domain part with an mx record.
This will not stop invalid email, it only adds an extra check and a possible help for typos.
Signed-off-by: BlackEagle <ike.devolder@gmail.com> --- web/lib/aur.inc.php | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/web/lib/aur.inc.php b/web/lib/aur.inc.php index c662b80..7e9d7de 100644 --- a/web/lib/aur.inc.php +++ b/web/lib/aur.inc.php @@ -80,7 +80,14 @@ function check_sid($dbh=NULL) { # verify that an email address looks like it is legitimate # function valid_email($addy) { - return (filter_var($addy, FILTER_VALIDATE_EMAIL) !== false); + if (filter_var($addy, FILTER_VALIDATE_EMAIL) === false) + return false;
This isn't noted down for the AUR yet but we always use opening and closing braces in other Arch projects, such as pacman (even if it's just a one-line block). It probably makes sense to add such a rule to our coding guidelines.
Could you please change that when resubmitting the patch? I'll take care of adding that to our guidelines later...
+ + $domain = substr($addy, strrpos($addy, '@')+1);
Please add spaces before and after the plus sign (it should look like "strrpos($addy, '@') + 1").
+ if (!checkdnsrr($domain, 'MX') || checkdnsrr($domain, 'A'))
That check doesn't look correct to me. Why would we reject mail addresses if there's an A record for the domain part? You probably forgot parentheses and/or a negation ("!") here.
Also, I'm not sure why you're checking for A records at all? Could you please elaborate on this?
Email delivery falls back to where the A record points if an MX record does not exist. https://en.wikipedia.org/wiki/MX_record#History_of_fallback_to_A
I think the correct check should be:
if (! (checkdnsrr($domain, 'MX') || checkdnsrr($domain, 'A'))) { return false; }
What about AAAA records? Shouldn't we check for these as well?
Oh, and don't forget to add braces here as well :)
+ return false; + + return true; }
# a new seed value for mt_srand() -- 1.7.9.4
Not sure. On Mon, Mar 19, 2012 at 11:41 PM, Lukas Fleischer <archlinux@cryptocrack.de>wrote:
On Mon, Mar 19, 2012 at 11:33:29PM +0200, Slavi Pantaleev wrote:
On Mon, Mar 19, 2012 at 11:14 PM, Lukas Fleischer <archlinux@cryptocrack.de>wrote:
On Mon, Mar 19, 2012 at 09:48:36PM +0100, BlackEagle wrote:
When an email address is of the correct format check if there is an mx record for the domain part. This forces people wanting to trash something in aur to use a domain part with an mx record.
This will not stop invalid email, it only adds an extra check and a possible help for typos.
Signed-off-by: BlackEagle <ike.devolder@gmail.com> --- web/lib/aur.inc.php | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/web/lib/aur.inc.php b/web/lib/aur.inc.php index c662b80..7e9d7de 100644 --- a/web/lib/aur.inc.php +++ b/web/lib/aur.inc.php @@ -80,7 +80,14 @@ function check_sid($dbh=NULL) { # verify that an email address looks like it is legitimate # function valid_email($addy) { - return (filter_var($addy, FILTER_VALIDATE_EMAIL) !== false); + if (filter_var($addy, FILTER_VALIDATE_EMAIL) === false) + return false;
This isn't noted down for the AUR yet but we always use opening and closing braces in other Arch projects, such as pacman (even if it's just a one-line block). It probably makes sense to add such a rule to our coding guidelines.
Could you please change that when resubmitting the patch? I'll take care of adding that to our guidelines later...
+ + $domain = substr($addy, strrpos($addy, '@')+1);
Please add spaces before and after the plus sign (it should look like "strrpos($addy, '@') + 1").
+ if (!checkdnsrr($domain, 'MX') || checkdnsrr($domain, 'A'))
That check doesn't look correct to me. Why would we reject mail addresses if there's an A record for the domain part? You probably forgot parentheses and/or a negation ("!") here.
Also, I'm not sure why you're checking for A records at all? Could you please elaborate on this?
Email delivery falls back to where the A record points if an MX record does not exist. https://en.wikipedia.org/wiki/MX_record#History_of_fallback_to_A
I think the correct check should be:
if (! (checkdnsrr($domain, 'MX') || checkdnsrr($domain, 'A'))) { return false; }
What about AAAA records? Shouldn't we check for these as well?
It doesn't seem like the Wikipedia article says anything about those. It probably doesn't matter much at this point. I don't think there are many mail servers now that have no MX/A records, but have IPv6 connectivity.
Oh, and don't forget to add braces here as well :)
+ return false; + + return true; }
# a new seed value for mt_srand() -- 1.7.9.4
- check if the format is valid - go and connect to the smtp server of the given domain and verify if the given email exists there --- web/lib/aur.inc.php | 75 ++++++++++++++++++++++++++++++++++++++++++++++++++- 1 files changed, 74 insertions(+), 1 deletions(-) diff --git a/web/lib/aur.inc.php b/web/lib/aur.inc.php index c662b80..3fc0a14 100644 --- a/web/lib/aur.inc.php +++ b/web/lib/aur.inc.php @@ -80,7 +80,80 @@ function check_sid($dbh=NULL) { # verify that an email address looks like it is legitimate # function valid_email($addy) { - return (filter_var($addy, FILTER_VALIDATE_EMAIL) !== false); + // check against RFC 3696 + if(filter_var($addy, FILTER_VALIDATE_EMAIL) === false) { + return false; + } + + // check dns for mx, a, aaaa records + list($local, $domain) = explode('@', $addy); + if(! (checkdnsrr($domain, 'MX') || checkdnsrr($domain, 'A') || checkdnsrr($domain, 'AAAA'))) { + return false; + } + + // get mx records and check full email address + $mxlist = array(); + $mxweight = array(); + getmxrr($domain, $mxlist, $mxweight); + $mx = array_combine($mxweight, $mxlist); + ksort($mx); + + //smtp_test_email($addy, current($mx)); + foreach($mx as $prio => $mxsrv) { + if(smtp_test_email($addy, $mxsrv) === true) { + return true; + } + } + + return false; +} + +# verify that an email address exists on the smtp server +# +function smtp_test_email($addy, $mxsrv) { + if(($smtp = fsockopen($mxsrv, 25)) === false) { + return false; + } + + if(intval(preg_replace('/^\([0-9]{3}\).*/', '\1', fgets($smtp))) !== 220) { + smtp_close($smtp); + return false; + } + + fwrite($smtp, "HELO $mxsrv\r\n"); + if(intval(preg_replace('/^\([0-9]{3}\).*/', '\1', fgets($smtp))) !== 250) { + smtp_close($smtp); + return false; + } + + fwrite($smtp, "MAIL FROM: <mailtest@archlinux.org>\r\n"); + if(intval(preg_replace('/^\([0-9]{3}\).*/', '\1', fgets($smtp))) !== 250) { + smtp_close($smtp); + return false; + } + + fwrite($smtp, "RCPT TO: <$addy>\r\n"); + $code = intval(preg_replace('/^\([0-9]{3}\).*/', '\1', fgets($smtp))); + /** + * 250 = success + * 451 or 452 = address got greylisted but another error occured + * so assume ok + */ + if($code !== 250 && $code !== 451 && $code !== 452) { + smtp_close($smtp); + return false; + } + + smtp_close($smtp); + return true; +} + +# close smtp conneciton +# +function smtp_close(&$smtp) { + fwrite($smtp, "RSET\r\n"); + fwrite($smtp, "QUIT\r\n"); + fclose($smtp); } # a new seed value for mt_srand() -- 1.7.9.1
On Tue, Mar 20, 2012 at 10:36:11AM +0100, BlackEagle wrote:
- check if the format is valid - go and connect to the smtp server of the given domain and verify if the given email exists there
I'm still not convinced that this is the right way to go. This check is quite technical and looks error-prone, while it still allows anyone to use arbitrary mail addresses (e.g. "archlinux@cryptocrack.de" or random mail addresses from an email address list created by a crawler). As I said on IRC, there's probably only two (proper) ways to fix this: * Send verification mails. * Use captchas. I'll still add some comments to your patch below...
--- web/lib/aur.inc.php | 75 ++++++++++++++++++++++++++++++++++++++++++++++++++- 1 files changed, 74 insertions(+), 1 deletions(-)
diff --git a/web/lib/aur.inc.php b/web/lib/aur.inc.php index c662b80..3fc0a14 100644 --- a/web/lib/aur.inc.php +++ b/web/lib/aur.inc.php @@ -80,7 +80,80 @@ function check_sid($dbh=NULL) { # verify that an email address looks like it is legitimate # function valid_email($addy) { - return (filter_var($addy, FILTER_VALIDATE_EMAIL) !== false); + // check against RFC 3696
Please use tabs, not spaces, for indentation.
+ if(filter_var($addy, FILTER_VALIDATE_EMAIL) === false) { + return false; + } + + // check dns for mx, a, aaaa records + list($local, $domain) = explode('@', $addy); + if(! (checkdnsrr($domain, 'MX') || checkdnsrr($domain, 'A') || checkdnsrr($domain, 'AAAA'))) { + return false; + }
This check could be dropped if we actually decided for this complicated method. We check for MX records below.
+ + // get mx records and check full email address + $mxlist = array(); + $mxweight = array(); + getmxrr($domain, $mxlist, $mxweight); + $mx = array_combine($mxweight, $mxlist); + ksort($mx);
You checked for A and AAAA records above but don't add them to the "$mx" array here?
+ + //smtp_test_email($addy, current($mx));
Please strip commented code from patches.
+ foreach($mx as $prio => $mxsrv) { + if(smtp_test_email($addy, $mxsrv) === true) { + return true; + } + } + + return false; +} + +# verify that an email address exists on the smtp server +# +function smtp_test_email($addy, $mxsrv) { + if(($smtp = fsockopen($mxsrv, 25)) === false) { + return false; + } + + if(intval(preg_replace('/^\([0-9]{3}\).*/', '\1', fgets($smtp))) !== 220) {
Better use substr() here: if (intval(substr(fgets($smtp), 0, 3)) !== 220) { [...] Also, what about other error codes, such as 421?
+ smtp_close($smtp); + return false; + } + + fwrite($smtp, "HELO $mxsrv\r\n"); + if(intval(preg_replace('/^\([0-9]{3}\).*/', '\1', fgets($smtp))) !== 250) {
Same here.
+ smtp_close($smtp); + return false; + } + + fwrite($smtp, "MAIL FROM: <mailtest@archlinux.org>\r\n"); + if(intval(preg_replace('/^\([0-9]{3}\).*/', '\1', fgets($smtp))) !== 250) {
Same here.
+ smtp_close($smtp); + return false; + } + + fwrite($smtp, "RCPT TO: <$addy>\r\n"); + $code = intval(preg_replace('/^\([0-9]{3}\).*/', '\1', fgets($smtp)));
Same here.
+ /** + * 250 = success + * 451 or 452 = address got greylisted but another error occured + * so assume ok + */ + if($code !== 250 && $code !== 451 && $code !== 452) {
What about 251? What about 450? This is what I don't like about this patch... There's a dozens of corner cases to consider and I don't think this is the right place to check for these.
+ smtp_close($smtp); + return false; + } + + smtp_close($smtp); + return true; +} + +# close smtp conneciton +# +function smtp_close(&$smtp) { + fwrite($smtp, "RSET\r\n"); + fwrite($smtp, "QUIT\r\n"); + fclose($smtp); }
# a new seed value for mt_srand() -- 1.7.9.1
Contacting the mail server to verify that the mailbox exists seems a little excessive. I think just checking for an MX/A record is good in catching some obviously fake addresses and reliable enough. Maybe we should just stop there. On Tue, Mar 20, 2012 at 12:31 PM, Lukas Fleischer <archlinux@cryptocrack.de>wrote:
On Tue, Mar 20, 2012 at 10:36:11AM +0100, BlackEagle wrote:
- check if the format is valid - go and connect to the smtp server of the given domain and verify if the given email exists there
I'm still not convinced that this is the right way to go. This check is quite technical and looks error-prone, while it still allows anyone to use arbitrary mail addresses (e.g. "archlinux@cryptocrack.de" or random mail addresses from an email address list created by a crawler).
As I said on IRC, there's probably only two (proper) ways to fix this:
* Send verification mails. * Use captchas.
I'll still add some comments to your patch below...
--- web/lib/aur.inc.php | 75 ++++++++++++++++++++++++++++++++++++++++++++++++++- 1 files changed, 74 insertions(+), 1 deletions(-)
diff --git a/web/lib/aur.inc.php b/web/lib/aur.inc.php index c662b80..3fc0a14 100644 --- a/web/lib/aur.inc.php +++ b/web/lib/aur.inc.php @@ -80,7 +80,80 @@ function check_sid($dbh=NULL) { # verify that an email address looks like it is legitimate # function valid_email($addy) { - return (filter_var($addy, FILTER_VALIDATE_EMAIL) !== false); + // check against RFC 3696
Please use tabs, not spaces, for indentation.
+ if(filter_var($addy, FILTER_VALIDATE_EMAIL) === false) { + return false; + } + + // check dns for mx, a, aaaa records + list($local, $domain) = explode('@', $addy); + if(! (checkdnsrr($domain, 'MX') || checkdnsrr($domain, 'A') || checkdnsrr($domain, 'AAAA'))) { + return false; + }
This check could be dropped if we actually decided for this complicated method. We check for MX records below.
+ + // get mx records and check full email address + $mxlist = array(); + $mxweight = array(); + getmxrr($domain, $mxlist, $mxweight); + $mx = array_combine($mxweight, $mxlist); + ksort($mx);
You checked for A and AAAA records above but don't add them to the "$mx" array here?
+ + //smtp_test_email($addy, current($mx));
Please strip commented code from patches.
+ foreach($mx as $prio => $mxsrv) { + if(smtp_test_email($addy, $mxsrv) === true) { + return true; + } + } + + return false; +} + +# verify that an email address exists on the smtp server +# +function smtp_test_email($addy, $mxsrv) { + if(($smtp = fsockopen($mxsrv, 25)) === false) { + return false; + } + + if(intval(preg_replace('/^\([0-9]{3}\).*/', '\1', fgets($smtp))) !== 220) {
Better use substr() here:
if (intval(substr(fgets($smtp), 0, 3)) !== 220) { [...]
Also, what about other error codes, such as 421?
+ smtp_close($smtp); + return false; + } + + fwrite($smtp, "HELO $mxsrv\r\n"); + if(intval(preg_replace('/^\([0-9]{3}\).*/', '\1', fgets($smtp))) !== 250) {
Same here.
+ smtp_close($smtp); + return false; + } + + fwrite($smtp, "MAIL FROM: <mailtest@archlinux.org>\r\n"); + if(intval(preg_replace('/^\([0-9]{3}\).*/', '\1', fgets($smtp))) !== 250) {
Same here.
+ smtp_close($smtp); + return false; + } + + fwrite($smtp, "RCPT TO: <$addy>\r\n"); + $code = intval(preg_replace('/^\([0-9]{3}\).*/', '\1', fgets($smtp)));
Same here.
+ /** + * 250 = success + * 451 or 452 = address got greylisted but another error occured + * so assume ok + */ + if($code !== 250 && $code !== 451 && $code !== 452) {
What about 251? What about 450?
This is what I don't like about this patch... There's a dozens of corner cases to consider and I don't think this is the right place to check for these.
+ smtp_close($smtp); + return false; + } + + smtp_close($smtp); + return true; +} + +# close smtp conneciton +# +function smtp_close(&$smtp) { + fwrite($smtp, "RSET\r\n"); + fwrite($smtp, "QUIT\r\n"); + fclose($smtp); }
# a new seed value for mt_srand() -- 1.7.9.1
On Tue, Mar 20, 2012 at 11:31:32AM +0100, Lukas Fleischer wrote:
On Tue, Mar 20, 2012 at 10:36:11AM +0100, BlackEagle wrote:
- check if the format is valid - go and connect to the smtp server of the given domain and verify if the given email exists there
I'm still not convinced that this is the right way to go. This check is quite technical and looks error-prone, while it still allows anyone to use arbitrary mail addresses (e.g. "archlinux@cryptocrack.de" or random mail addresses from an email address list created by a crawler).
As I said on IRC, there's probably only two (proper) ways to fix this:
* Send verification mails.
is the best solution, but this patch could be an intermediate until the verification stuff is in place
* Use captchas.
this is only annoying the user, and rendered useless when using verification mail.
I'll still add some comments to your patch below...
--- web/lib/aur.inc.php | 75 ++++++++++++++++++++++++++++++++++++++++++++++++++- 1 files changed, 74 insertions(+), 1 deletions(-)
diff --git a/web/lib/aur.inc.php b/web/lib/aur.inc.php index c662b80..3fc0a14 100644 --- a/web/lib/aur.inc.php +++ b/web/lib/aur.inc.php @@ -80,7 +80,80 @@ function check_sid($dbh=NULL) { # verify that an email address looks like it is legitimate # function valid_email($addy) { - return (filter_var($addy, FILTER_VALIDATE_EMAIL) !== false); + // check against RFC 3696
Please use tabs, not spaces, for indentation.
sorry was on another configuration since i have to use spaces there
+ if(filter_var($addy, FILTER_VALIDATE_EMAIL) === false) { + return false; + } + + // check dns for mx, a, aaaa records + list($local, $domain) = explode('@', $addy); + if(! (checkdnsrr($domain, 'MX') || checkdnsrr($domain, 'A') || checkdnsrr($domain, 'AAAA'))) { + return false; + }
This check could be dropped if we actually decided for this complicated method. We check for MX records below.
+ + // get mx records and check full email address + $mxlist = array(); + $mxweight = array(); + getmxrr($domain, $mxlist, $mxweight); + $mx = array_combine($mxweight, $mxlist); + ksort($mx);
You checked for A and AAAA records above but don't add them to the "$mx" array here?
+ + //smtp_test_email($addy, current($mx));
Please strip commented code from patches.
sorry
+ foreach($mx as $prio => $mxsrv) { + if(smtp_test_email($addy, $mxsrv) === true) { + return true; + } + } + + return false; +} + +# verify that an email address exists on the smtp server +# +function smtp_test_email($addy, $mxsrv) { + if(($smtp = fsockopen($mxsrv, 25)) === false) { + return false; + } + + if(intval(preg_replace('/^\([0-9]{3}\).*/', '\1', fgets($smtp))) !== 220) {
Better use substr() here:
if (intval(substr(fgets($smtp), 0, 3)) !== 220) { [...]
Also, what about other error codes, such as 421?
i think we can consider 421 as an error since we want to verify the complete email address
+ smtp_close($smtp); + return false; + } + + fwrite($smtp, "HELO $mxsrv\r\n"); + if(intval(preg_replace('/^\([0-9]{3}\).*/', '\1', fgets($smtp))) !== 250) {
Same here.
+ smtp_close($smtp); + return false; + } + + fwrite($smtp, "MAIL FROM: <mailtest@archlinux.org>\r\n"); + if(intval(preg_replace('/^\([0-9]{3}\).*/', '\1', fgets($smtp))) !== 250) {
Same here.
+ smtp_close($smtp); + return false; + } + + fwrite($smtp, "RCPT TO: <$addy>\r\n"); + $code = intval(preg_replace('/^\([0-9]{3}\).*/', '\1', fgets($smtp)));
Same here.
+ /** + * 250 = success + * 451 or 452 = address got greylisted but another error occured + * so assume ok + */ + if($code !== 250 && $code !== 451 && $code !== 452) {
What about 251? What about 450?
251 should be taken in then 551 comes in the picture and both would require this function to be run with the returned <forward-path> when we follow the forward-path stuff then we could run out of time if the following ones are also forwarding 450 is considered an error so i would let it fail there
This is what I don't like about this patch... There's a dozens of corner cases to consider and I don't think this is the right place to check for these.
+ smtp_close($smtp); + return false; + } + + smtp_close($smtp); + return true; +} + +# close smtp conneciton +# +function smtp_close(&$smtp) { + fwrite($smtp, "RSET\r\n"); + fwrite($smtp, "QUIT\r\n"); + fclose($smtp); }
# a new seed value for mt_srand() -- 1.7.9.1
-- Ike
On Tue, Mar 20, 2012 at 12:06:03PM +0100, Ike Devolder wrote:
On Tue, Mar 20, 2012 at 11:31:32AM +0100, Lukas Fleischer wrote:
On Tue, Mar 20, 2012 at 10:36:11AM +0100, BlackEagle wrote:
- check if the format is valid - go and connect to the smtp server of the given domain and verify if the given email exists there
I'm still not convinced that this is the right way to go. This check is quite technical and looks error-prone, while it still allows anyone to use arbitrary mail addresses (e.g. "archlinux@cryptocrack.de" or random mail addresses from an email address list created by a crawler).
As I said on IRC, there's probably only two (proper) ways to fix this:
* Send verification mails.
is the best solution, but this patch could be an intermediate until the verification stuff is in place
To be honest, I would prefer the first version (the one that checks MX/A/AAAA records only and doesn't replace the filter_var() check) of your patch as an intermediate. This is "good enough" for now - at least until someone steps up and implements email verification. You probably noticed yourself that dealing with all SMTP corner cases is tricky, error-prone and a pain to maintain; and, as I said before, this can be bypassed just as easy as the MX record check.
* Use captchas.
this is only annoying the user, and rendered useless when using verification mail.
I'll still add some comments to your patch below...
--- web/lib/aur.inc.php | 75 ++++++++++++++++++++++++++++++++++++++++++++++++++- 1 files changed, 74 insertions(+), 1 deletions(-)
diff --git a/web/lib/aur.inc.php b/web/lib/aur.inc.php index c662b80..3fc0a14 100644 --- a/web/lib/aur.inc.php +++ b/web/lib/aur.inc.php @@ -80,7 +80,80 @@ function check_sid($dbh=NULL) { # verify that an email address looks like it is legitimate # function valid_email($addy) { - return (filter_var($addy, FILTER_VALIDATE_EMAIL) !== false); + // check against RFC 3696
Please use tabs, not spaces, for indentation.
sorry was on another configuration since i have to use spaces there
+ if(filter_var($addy, FILTER_VALIDATE_EMAIL) === false) { + return false; + } + + // check dns for mx, a, aaaa records + list($local, $domain) = explode('@', $addy); + if(! (checkdnsrr($domain, 'MX') || checkdnsrr($domain, 'A') || checkdnsrr($domain, 'AAAA'))) { + return false; + }
This check could be dropped if we actually decided for this complicated method. We check for MX records below.
+ + // get mx records and check full email address + $mxlist = array(); + $mxweight = array(); + getmxrr($domain, $mxlist, $mxweight); + $mx = array_combine($mxweight, $mxlist); + ksort($mx);
You checked for A and AAAA records above but don't add them to the "$mx" array here?
+ + //smtp_test_email($addy, current($mx));
Please strip commented code from patches.
sorry
+ foreach($mx as $prio => $mxsrv) { + if(smtp_test_email($addy, $mxsrv) === true) { + return true; + } + } + + return false; +} + +# verify that an email address exists on the smtp server +# +function smtp_test_email($addy, $mxsrv) { + if(($smtp = fsockopen($mxsrv, 25)) === false) { + return false; + } + + if(intval(preg_replace('/^\([0-9]{3}\).*/', '\1', fgets($smtp))) !== 220) {
Better use substr() here:
if (intval(substr(fgets($smtp), 0, 3)) !== 220) { [...]
Also, what about other error codes, such as 421?
i think we can consider 421 as an error since we want to verify the complete email address
+ smtp_close($smtp); + return false; + } + + fwrite($smtp, "HELO $mxsrv\r\n"); + if(intval(preg_replace('/^\([0-9]{3}\).*/', '\1', fgets($smtp))) !== 250) {
Same here.
+ smtp_close($smtp); + return false; + } + + fwrite($smtp, "MAIL FROM: <mailtest@archlinux.org>\r\n"); + if(intval(preg_replace('/^\([0-9]{3}\).*/', '\1', fgets($smtp))) !== 250) {
Same here.
+ smtp_close($smtp); + return false; + } + + fwrite($smtp, "RCPT TO: <$addy>\r\n"); + $code = intval(preg_replace('/^\([0-9]{3}\).*/', '\1', fgets($smtp)));
Same here.
+ /** + * 250 = success + * 451 or 452 = address got greylisted but another error occured + * so assume ok + */ + if($code !== 250 && $code !== 451 && $code !== 452) {
What about 251? What about 450?
251 should be taken in then 551 comes in the picture and both would require this function to be run with the returned <forward-path> when we follow the forward-path stuff then we could run out of time if the following ones are also forwarding
450 is considered an error so i would let it fail there
This is what I don't like about this patch... There's a dozens of corner cases to consider and I don't think this is the right place to check for these.
+ smtp_close($smtp); + return false; + } + + smtp_close($smtp); + return true; +} + +# close smtp conneciton +# +function smtp_close(&$smtp) { + fwrite($smtp, "RSET\r\n"); + fwrite($smtp, "QUIT\r\n"); + fclose($smtp); }
# a new seed value for mt_srand() -- 1.7.9.1
-- Ike
2012/3/20 Lukas Fleischer <archlinux@cryptocrack.de>
On Tue, Mar 20, 2012 at 12:06:03PM +0100, Ike Devolder wrote:
On Tue, Mar 20, 2012 at 11:31:32AM +0100, Lukas Fleischer wrote:
On Tue, Mar 20, 2012 at 10:36:11AM +0100, BlackEagle wrote:
- check if the format is valid - go and connect to the smtp server of the given domain and verify if the given email exists there
I'm still not convinced that this is the right way to go. This check is quite technical and looks error-prone, while it still allows anyone to use arbitrary mail addresses (e.g. "archlinux@cryptocrack.de" or random mail addresses from an email address list created by a crawler).
As I said on IRC, there's probably only two (proper) ways to fix this:
* Send verification mails.
is the best solution, but this patch could be an intermediate until the verification stuff is in place
To be honest, I would prefer the first version (the one that checks MX/A/AAAA records only and doesn't replace the filter_var() check) of your patch as an intermediate. This is "good enough" for now - at least until someone steps up and implements email verification. You probably noticed yourself that dealing with all SMTP corner cases is tricky, error-prone and a pain to maintain; and, as I said before, this can be bypassed just as easy as the MX record check.
* Use captchas.
this is only annoying the user, and rendered useless when using verification mail.
I'll still add some comments to your patch below...
--- web/lib/aur.inc.php | 75
++++++++++++++++++++++++++++++++++++++++++++++++++-
1 files changed, 74 insertions(+), 1 deletions(-)
diff --git a/web/lib/aur.inc.php b/web/lib/aur.inc.php index c662b80..3fc0a14 100644 --- a/web/lib/aur.inc.php +++ b/web/lib/aur.inc.php @@ -80,7 +80,80 @@ function check_sid($dbh=NULL) { # verify that an email address looks like it is legitimate # function valid_email($addy) { - return (filter_var($addy, FILTER_VALIDATE_EMAIL) !== false); + // check against RFC 3696
Please use tabs, not spaces, for indentation.
sorry was on another configuration since i have to use spaces there
+ if(filter_var($addy, FILTER_VALIDATE_EMAIL) === false) { + return false; + } + + // check dns for mx, a, aaaa records + list($local, $domain) = explode('@', $addy); + if(! (checkdnsrr($domain, 'MX') || checkdnsrr($domain, 'A') ||
checkdnsrr($domain, 'AAAA'))) {
+ return false; + }
This check could be dropped if we actually decided for this complicated method. We check for MX records below.
+ + // get mx records and check full email address + $mxlist = array(); + $mxweight = array(); + getmxrr($domain, $mxlist, $mxweight); + $mx = array_combine($mxweight, $mxlist); + ksort($mx);
You checked for A and AAAA records above but don't add them to the "$mx" array here?
+ + //smtp_test_email($addy, current($mx));
Please strip commented code from patches.
sorry
+ foreach($mx as $prio => $mxsrv) { + if(smtp_test_email($addy, $mxsrv) === true) { + return true; + } + } + + return false; +} + +# verify that an email address exists on the smtp server +# +function smtp_test_email($addy, $mxsrv) { + if(($smtp = fsockopen($mxsrv, 25)) === false) { + return false; + } + + if(intval(preg_replace('/^\([0-9]{3}\).*/', '\1',
fgets($smtp))) !== 220) {
Better use substr() here:
if (intval(substr(fgets($smtp), 0, 3)) !== 220) { [...]
Also, what about other error codes, such as 421?
i think we can consider 421 as an error since we want to verify the complete email address
+ smtp_close($smtp); + return false; + } + + fwrite($smtp, "HELO $mxsrv\r\n"); + if(intval(preg_replace('/^\([0-9]{3}\).*/', '\1',
fgets($smtp))) !== 250) {
Same here.
+ smtp_close($smtp); + return false; + } + + fwrite($smtp, "MAIL FROM: <mailtest@archlinux.org>\r\n"); + if(intval(preg_replace('/^\([0-9]{3}\).*/', '\1',
fgets($smtp))) !== 250) {
Same here.
+ smtp_close($smtp); + return false; + } + + fwrite($smtp, "RCPT TO: <$addy>\r\n"); + $code = intval(preg_replace('/^\([0-9]{3}\).*/', '\1',
fgets($smtp)));
Same here.
+ /** + * 250 = success + * 451 or 452 = address got greylisted but another error occured + * so assume ok + */ + if($code !== 250 && $code !== 451 && $code !== 452) {
What about 251? What about 450?
251 should be taken in then 551 comes in the picture and both would require this function to be run with the returned <forward-path> when we follow the forward-path stuff then we could run out of time if the following ones are also forwarding
450 is considered an error so i would let it fail there
This is what I don't like about this patch... There's a dozens of
corner
cases to consider and I don't think this is the right place to check for these.
+ smtp_close($smtp); + return false; + } + + smtp_close($smtp); + return true; +} + +# close smtp conneciton +# +function smtp_close(&$smtp) { + fwrite($smtp, "RSET\r\n"); + fwrite($smtp, "QUIT\r\n"); + fclose($smtp); }
# a new seed value for mt_srand() -- 1.7.9.1
-- Ike
i'll send a new patch -- Ike
On 20.03.2012 10:36, BlackEagle wrote:
- check if the format is valid - go and connect to the smtp server of the given domain and verify if the given email exists there
Good idea, but I've seen mail servers accepting all recipients during the SMTP session, but later sending a DSN because the user didn't exist. That means we should still implement verification emails. Having this check is still good though.
--- web/lib/aur.inc.php | 75 ++++++++++++++++++++++++++++++++++++++++++++++++++- 1 files changed, 74 insertions(+), 1 deletions(-)
diff --git a/web/lib/aur.inc.php b/web/lib/aur.inc.php index c662b80..3fc0a14 100644 --- a/web/lib/aur.inc.php +++ b/web/lib/aur.inc.php @@ -80,7 +80,80 @@ function check_sid($dbh=NULL) { # verify that an email address looks like it is legitimate # function valid_email($addy) { - return (filter_var($addy, FILTER_VALIDATE_EMAIL) !== false); + // check against RFC 3696 + if(filter_var($addy, FILTER_VALIDATE_EMAIL) === false) { + return false; + } + + // check dns for mx, a, aaaa records + list($local, $domain) = explode('@', $addy); + if(! (checkdnsrr($domain, 'MX') || checkdnsrr($domain, 'A') || checkdnsrr($domain, 'AAAA'))) {
Please display an error message if this fails.
+ return false; + } + + // get mx records and check full email address + $mxlist = array(); + $mxweight = array(); + getmxrr($domain, $mxlist, $mxweight);
http://php.net/getmxrr#refsect1-function.getmxrr-notes states: "This function should not be used for the purposes of address verification. Only the mailexchangers found in DNS are returned ..."
+ $mx = array_combine($mxweight, $mxlist); + ksort($mx); + + //smtp_test_email($addy, current($mx)); + foreach($mx as $prio => $mxsrv) { + if(smtp_test_email($addy, $mxsrv) === true) { + return true; + } + }
Since this will eventually return false negatives, please display all SMTP sessions if you get here. This will help users understand what went wrong so they can fix it more easily.
+ + return false; +} + +# verify that an email address exists on the smtp server +# +function smtp_test_email($addy, $mxsrv) { + if(($smtp = fsockopen($mxsrv, 25)) === false) { + return false; + } + + if(intval(preg_replace('/^\([0-9]{3}\).*/', '\1', fgets($smtp))) !== 220) { + smtp_close($smtp); + return false; + } + + fwrite($smtp, "HELO $mxsrv\r\n");
HELO should send the client name, not the server name.
+ if(intval(preg_replace('/^\([0-9]{3}\).*/', '\1', fgets($smtp))) !== 250) { + smtp_close($smtp); + return false; + } + + fwrite($smtp, "MAIL FROM: <mailtest@archlinux.org>\r\n"); + if(intval(preg_replace('/^\([0-9]{3}\).*/', '\1', fgets($smtp))) !== 250) { + smtp_close($smtp); + return false; + } + + fwrite($smtp, "RCPT TO: <$addy>\r\n"); + $code = intval(preg_replace('/^\([0-9]{3}\).*/', '\1', fgets($smtp))); + /** + * 250 = success + * 451 or 452 = address got greylisted but another error occured
s/got/not/?
+ * so assume ok + */ + if($code !== 250 && $code !== 451 && $code !== 452) {
I'm not sure if rejecting greylisted adresses is a good idea because that means the users have to resubmit the form after 5-30 minutes. You should ask them to confirm that their address is correct (typo catching) and then ignore all temporary errors (all 4xx codes). If you get a 5xx it's fine to abort. If you want to make sure the address really exists, you have to implement verification mails. Just checking return codes for RCPT TO is not enough.
+ smtp_close($smtp); + return false; + } + + smtp_close($smtp); + return true; +} + +# close smtp conneciton +# +function smtp_close(&$smtp) { + fwrite($smtp, "RSET\r\n"); + fwrite($smtp, "QUIT\r\n"); + fclose($smtp); }
# a new seed value for mt_srand()
-- Florian Pritz
2012/3/20 Florian Pritz <bluewind@xinu.at>
On 20.03.2012 10:36, BlackEagle wrote:
- check if the format is valid - go and connect to the smtp server of the given domain and verify if the given email exists there
Good idea, but I've seen mail servers accepting all recipients during the SMTP session, but later sending a DSN because the user didn't exist. That means we should still implement verification emails. Having this check is still good though.
--- web/lib/aur.inc.php | 75 ++++++++++++++++++++++++++++++++++++++++++++++++++- 1 files changed, 74 insertions(+), 1 deletions(-)
diff --git a/web/lib/aur.inc.php b/web/lib/aur.inc.php index c662b80..3fc0a14 100644 --- a/web/lib/aur.inc.php +++ b/web/lib/aur.inc.php @@ -80,7 +80,80 @@ function check_sid($dbh=NULL) { # verify that an email address looks like it is legitimate # function valid_email($addy) { - return (filter_var($addy, FILTER_VALIDATE_EMAIL) !== false); + // check against RFC 3696 + if(filter_var($addy, FILTER_VALIDATE_EMAIL) === false) { + return false; + } + + // check dns for mx, a, aaaa records + list($local, $domain) = explode('@', $addy); + if(! (checkdnsrr($domain, 'MX') || checkdnsrr($domain, 'A') || checkdnsrr($domain, 'AAAA'))) {
Please display an error message if this fails.
+ return false; + } + + // get mx records and check full email address + $mxlist = array(); + $mxweight = array(); + getmxrr($domain, $mxlist, $mxweight);
http://php.net/getmxrr#refsect1-function.getmxrr-notes states: "This function should not be used for the purposes of address verification. Only the mailexchangers found in DNS are returned ..."
+ $mx = array_combine($mxweight, $mxlist); + ksort($mx); + + //smtp_test_email($addy, current($mx)); + foreach($mx as $prio => $mxsrv) { + if(smtp_test_email($addy, $mxsrv) === true) { + return true; + } + }
Since this will eventually return false negatives, please display all SMTP sessions if you get here. This will help users understand what went wrong so they can fix it more easily.
+ + return false; +} + +# verify that an email address exists on the smtp server +# +function smtp_test_email($addy, $mxsrv) { + if(($smtp = fsockopen($mxsrv, 25)) === false) { + return false; + } + + if(intval(preg_replace('/^\([0-9]{3}\).*/', '\1', fgets($smtp))) !== 220) { + smtp_close($smtp); + return false; + } + + fwrite($smtp, "HELO $mxsrv\r\n");
HELO should send the client name, not the server name.
+ if(intval(preg_replace('/^\([0-9]{3}\).*/', '\1', fgets($smtp))) !== 250) { + smtp_close($smtp); + return false; + } + + fwrite($smtp, "MAIL FROM: <mailtest@archlinux.org>\r\n"); + if(intval(preg_replace('/^\([0-9]{3}\).*/', '\1', fgets($smtp))) !== 250) { + smtp_close($smtp); + return false; + } + + fwrite($smtp, "RCPT TO: <$addy>\r\n"); + $code = intval(preg_replace('/^\([0-9]{3}\).*/', '\1', fgets($smtp))); + /** + * 250 = success + * 451 or 452 = address got greylisted but another error occured
s/got/not/?
got
+ * so assume ok + */ + if($code !== 250 && $code !== 451 && $code !== 452) {
I'm not sure if rejecting greylisted adresses is a good idea because that means the users have to resubmit the form after 5-30 minutes. You should ask them to confirm that their address is correct (typo catching) and then ignore all temporary errors (all 4xx codes). If you get a 5xx it's fine to abort.
the above does allow the greylisted accounts
If you want to make sure the address really exists, you have to implement verification mails. Just checking return codes for RCPT TO is not enough.
+ smtp_close($smtp); + return false; + } + + smtp_close($smtp); + return true; +} + +# close smtp conneciton +# +function smtp_close(&$smtp) { + fwrite($smtp, "RSET\r\n"); + fwrite($smtp, "QUIT\r\n"); + fclose($smtp); }
# a new seed value for mt_srand()
-- Florian Pritz
then you should just drop this check and only do the mx, a, ... check for the obvious ones and then send the verification mail for full confirmation -- Ike
participants (5)
-
BlackEagle
-
Florian Pritz
-
Ike Devolder
-
Lukas Fleischer
-
Slavi Pantaleev