[aur-dev] [PATCH] Make external links in comments clickable (FS#20137).
This is a somewhat hacky patch to make links in AUR comments clickable (fixes FS#20137 [1]). Large parts of this code are taken from the DokuWiki plugin that is also used in Flyspray. I haven't had time to test it extensively, so I'd suggest running some more tests before this is committed.

[1] https://bugs.archlinux.org/task/20137
---
 web/lib/aur.inc               | 45 +++++++++++++++++++++++++++++++++++++++++
 web/template/pkg_comments.php |  2 +-
 2 files changed, 46 insertions(+), 1 deletions(-)

diff --git a/web/lib/aur.inc b/web/lib/aur.inc
index bd69c4c..b0cfdc8 100644
--- a/web/lib/aur.inc
+++ b/web/lib/aur.inc
@@ -494,3 +494,48 @@ function salted_hash($passwd, $salt)
 	}
 	return md5($salt . $passwd);
 }
+
+function parse_link($matches)
+{
+	$name = $url = $matches[0];
+
+	if (substr($url, 0, 3) == 'ftp' && (substr($url, 0, 6) != 'ftp://')) {
+		$url = 'ftp://' . $url;
+	}
+	elseif (substr($url, 0, 3) == 'www') {
+		$url = 'http://' . $url;
+	}
+
+	$url = str_replace('&amp;', '&', $url);
+	$url = str_replace('&', '&amp;', $url);
+	$url = strtr($url, array('>' => '%3E', '<' => '%3C', '"' => '%22'));
+
+	return '<a href="' . $url . '">' . $name . '</a>';
+}
+
+function parse_comment($comment)
+{
+	$schemes = array('http', 'https', 'ftp');
+	$ltrs = '\w';
+	$gunk = '\/\#~:.?+=&%@!\-';
+	$punc = '.:?\-;,';
+	$host = $ltrs . $punc;
+	$any  = $ltrs . $gunk . $punc;
+
+	$patterns = array();
+
+	foreach ($schemes as $scheme) {
+		$patterns[] = '(\b(?i)' . $scheme . '(?-i):\/\/[' . $any . ']+?(?=[' . $punc . ']*[^' . $any . ']))';
+	}
+
+	$patterns[] = '(\b(?i)www?(?-i)\.[' . $host . ']+?\.[' . $host . ']+?[' . $any . ']+?(?=[' . $punc . ']*[^' . $any . ']))';
+	$patterns[] = '(\b(?i)ftp?(?-i)\.[' . $host . ']+?\.[' . $host . ']+?[' . $any . ']+?(?=[' . $punc . ']*[^' . $any . ']))';
+
+	$regex = '/' . implode('|', $patterns) . '/msS';
+
+	$comment = htmlspecialchars($comment);
+	$comment = preg_replace_callback($regex, 'parse_link', $comment . "\n");
+	$comment = nl2br($comment);
+
+	return $comment;
+}
diff --git a/web/template/pkg_comments.php b/web/template/pkg_comments.php
index 02171a0..2ca9bf0 100644
--- a/web/template/pkg_comments.php
+++ b/web/template/pkg_comments.php
@@ -20,7 +20,7 @@ while (list($indx, $carr) = each($comments)) { ?>
 ?></div>
 <blockquote class="comment-body">
 <div>
-<?php echo nl2br(htmlspecialchars($carr['Comments'])) ?>
+<?php echo parse_comment($carr['Comments']) ?>
 </div>
 </blockquote>
 <?php
--
1.7.3
On 09/30/2010 05:22 PM, Lukas Fleischer wrote:
What about the occurrences of "&amp;(html-entity-code-here);" that you produced the line before?
I am not that experienced with PHP, but this looks like the $patterns array got replaced instead of extended.
Won't this render the next instruction useless if there are html-characters in a link?
Generally I would not make plain hostnames ("www.foo.tld") clickable. If people are not able to provide proper URLs, they have a serious problem. (There is also the technical argument that the hostname is not a good indicator of the kind of service the host provides.) Regards, PyroPeter -- freenode/pyropeter "12:50 - Ich drücke Return."
On Thu, Sep 30, 2010 at 06:18:24PM +0200, PyroPeter wrote:
Nothing? Any occurrence of an HTML entity code is correctly encoded: its ampersand is replaced by "&amp;". People shouldn't be able to insert HTML entities into comments manually. As I realized just now, the first line is actually superfluous, since ampersands have already been replaced by htmlspecialchars() by the time this line is executed (I didn't check that before; this part of the code was extracted from the DokuWiki plugin).
Nope, it doesn't. Check [1].
Nope. Links need to be escaped as well. I'm not sure what happens if a link contains quotes or "<"/">". This shouldn't happen too often, though.
Why not? What if you explicitly want to link to a project's home page? It'll also only convert hostnames if they start with a "www" or "ftp" subdomain, so comments referring to domains in other ways won't be converted. [1] http://www.php.net/manual/de/language.types.array.php#language.types.array.s...
On 09/30/2010 06:38 PM, Lukas Fleischer wrote:
Well, but you are encoding existing entities that are not "&amp;" as "&amp;foo;". See the example below.
I see, "$var[] = foo" creates the array $var if necessary and appends foo.
Imo, you should split the message at the link boundaries ("foo ", "http://foo.bar.tld", " baz"). Then you should encode the HTML entities in all elements, wrap the links in <a>'s, and then join everything together again.

I cannot think of a way to combine <a> tags with proper encoding of the user input using just "normal" string functions like replace, or regexes. I would be happy if you could prove the opposite, and will help by providing you with input that breaks your system.

== example 1 ==

input: "foo http://foo.tld/iLikeToUseApersands/foo&bar.html baz"

If I am not mistaken, $regex would be
"/http://foo.tld/iLikeToUseApersands/foo&bar.html/msS"
(are the "/" correctly escaped? I will assume they are). Then, $regex would be:
"/http:\/\/foo\.tld\/iLikeToUseApersands\/foo&bar\.html/msS"

$comment would be set by htmlspecialchars() to:
"foo http://foo.tld/iLikeToUseApersands/foo&amp;bar.html baz"

=> preg_replace_callback() would not match, as the "&" got replaced by "&amp;".
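The mismatch in example 1 can be reproduced in isolation. Below is a minimal standalone sketch (the URL and the pattern are made up for illustration, this is not the patch's actual regex): after htmlspecialchars(), a literal "&" has become "&amp;", so a URL pattern whose character class lacks ";" no longer matches the full URL.

```php
<?php
// Reproduce the mismatch from example 1: htmlspecialchars() turns "&"
// into "&amp;", so a pattern built for the raw comment text matches
// something different on the escaped text.
$comment = 'foo http://foo.tld/a&b.html baz';
$escaped = htmlspecialchars($comment);
// $escaped is now: 'foo http://foo.tld/a&amp;b.html baz'

// A character class with "&" but without ";" stops inside the entity:
preg_match('/\bhttp:\/\/[\w\/.&-]+/', $escaped, $m);
$match = $m[0]; // 'http://foo.tld/a&amp' -- the entity is cut in half
```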
You can also link to a home page using a valid URL. The additional "feature" may be nice, but it makes the code more complex. It also trains users to omit the "http://", and produces more work for devs, as they all now have to parse this invalid hostname+path stuff. Unrelated: You seem to accept only a-zA-Z in hostnames? Or does PHP's \w include 0-9 and language-dependent letters? What about underscores? Why does the <a>'s content only include the path of the URL? Regards, PyroPeter -- freenode/pyropeter "12:50 - Ich drücke Return."
On Thu, Sep 30, 2010 at 08:56:56PM +0200, PyroPeter wrote:
Well, but you are encoding existing entities that are not "&amp;" as "&amp;foo;". See the example below.
Yep, and that's how it's supposed to be. There shouldn't be any entities in comments that users put there and that are not encoded.
I see, "$var[] = foo" creates the array $var if necessary and appends foo.
Correct.
Yes... That would be cleaner, but also way more complicated to implement, and it would require a huge amount of code just for making links clickable.
Why shouldn't it work? preg_replace_callback() still matches if the URL contains a semicolon. This will be parsed and a valid link will be output (tested with the current Git version with the patch applied).
Hm, that's a question of taste. We'll let Loui decide :p
"\w" in perl compatible regex includes all alphanumeric characters plus the underscore ("_").
Why does the <a>'s content only include the path of the URL?
It doesn't. The "<a></a>"'s content contains exactly what the user typed (with special chars converted by htmlspecialchars()). Please don't just assume things but test your examples using a current GIT checkout with the patch applied in future.
On 10/01/2010 01:05 AM, Lukas Fleischer wrote:
Please don't just assume things but test your examples using a current GIT checkout with the patch applied in future.
I did not mean to offend you, and after applying the patch (which I should have done before sending the mails, you are right), your code in fact seems to work a lot better than I thought.

While testing, I found a bug. Post this URL:

http://foo.bar/<><>;

It seems to trigger two bugs at once: first, the regex does not match the whole URL, and second, the href is escaped twice.

Regards, PyroPeter -- freenode/pyropeter "12:50 - Ich drücke Return."
On Fri, Oct 01, 2010 at 02:15:41PM +0200, PyroPeter wrote:
I didn't feel offended in any way, but reporting bugs that don't exist is just counterproductive and a waste of time.
I already said in another mail [1] that there might be problems if the URL contains quotes or less-than/greater-than symbols. This can be fixed by removing the first str_replace() (which I also proposed in the same mail), or by repeating the second str_replace() for '"', "'", "<" and ">" (which might even be better from a security perspective). However, I don't think such URLs will be a common use case.

The second "bug" is expected behaviour, since punctuation marks at the end of URLs shouldn't be included in the URL itself (imagine someone putting a link at the end of a sentence). This is also how DokuWiki and Flyspray behave. If there really is a URL requiring a punctuation mark at its end (which there shouldn't be at all), this can be remarked upon in the comment itself.

[1] http://mailman.archlinux.org/pipermail/aur-dev/2010-September/001263.html
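The trailing-punctuation behaviour comes from the lookahead in the patch: the URL is matched lazily and only has to end where optional punctuation is followed by a non-URL character. A reduced standalone sketch of that mechanism (note the appended "\n", which the patch also adds so the lookahead can succeed when a URL ends the comment):

```php
<?php
// Reduced version of the patch's lookahead: trailing punctuation after
// a URL is left out of the match, so a sentence-final "." survives.
$punc = '.:?\-;,';
$any  = '\w\/\#~:.?+=&%@!\-' . $punc;
$re   = '/\bhttp:\/\/[' . $any . ']+?(?=[' . $punc . ']*[^' . $any . '])/';

// The patch appends "\n" so the lookahead has a non-URL character to
// anchor on even at the very end of a comment.
preg_match($re, "See http://example.org/page.\n", $m);
$url = $m[0]; // 'http://example.org/page' -- without the trailing dot
```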
On 09/30/2010 05:22 PM, Lukas Fleischer wrote:
I fixed the "<"/">" bugs and changed the indentation to tabs.

I also changed the regex to one that accepts everything that starts with one of 'http', 'https' or 'ftp' followed by a colon ":", contains no whitespace and ends with a letter (\w) or a slash "/". It also parses hostnames starting with "www." and ending in a 2 to 5 letter long TLD consisting of only a-z.

I have tested it with the comments at [0].

Regards, PyroPeter

[0] http://aur.keks.selfip.org/packages.php?ID=298
---
diff --git a/web/lib/aur.inc b/web/lib/aur.inc
index bd69c4c..7465fac 100644
--- a/web/lib/aur.inc
+++ b/web/lib/aur.inc
@@ -494,3 +494,24 @@ function salted_hash($passwd, $salt)
 	}
 	return md5($salt . $passwd);
 }
+
+function parse_link($matches)
+{
+	$name = $matches[0];
+	$url = preg_replace('/^www\./', 'http://www.', $matches[0]);
+	return '<a href="' . $url . '">' . $name . '</a>';
+}
+
+function parse_comment($comment)
+{
+	$regex = '/' . str_replace('/', '\/', implode('|', array(
+		'\b(http|https|ftp):[^\s]*[\w\d/]',
+		'\bwww\.[^\s/]+\.[a-z]{2,5}'
+	))) . '/S';
+
+	$comment = htmlspecialchars($comment);
+	$comment = preg_replace_callback($regex, 'parse_link', $comment);
+	$comment = nl2br($comment);
+
+	return $comment;
+}
diff --git a/web/template/pkg_comments.php b/web/template/pkg_comments.php
index 02171a0..2ca9bf0 100644
--- a/web/template/pkg_comments.php
+++ b/web/template/pkg_comments.php
@@ -20,7 +20,7 @@ while (list($indx, $carr) = each($comments)) { ?>
 ?></div>
 <blockquote class="comment-body">
 <div>
-<?php echo nl2br(htmlspecialchars($carr['Comments'])) ?>
+<?php echo parse_comment($carr['Comments']) ?>
 </div>
 </blockquote>
 <?php
--
freenode/pyropeter "12:50 - Ich drücke Return."
On Fri, Oct 01, 2010 at 04:59:22PM +0200, PyroPeter wrote:
This won't match URLs like "https://aur.archlinux.org/packages.php?O=0&K=", and an ampersand at the end of a URL won't be converted correctly :/ I'll try to implement it in a more proper way in the next few days. Maybe I'll actually go with splitting comments at link boundaries, as you suggested before... :)
On 10/01/2010 05:52 PM, Lukas Fleischer wrote:
Well, that's the problem: which characters should belong to the end of the URL, and which should not? There could also be cases in which punctuation belongs to the URL.

If punctuation is parsed as not belonging to the URL, there would be no way to post a working link to certain URLs. If punctuation is parsed as part of the URL, one could insert a space between the URL and the punctuation that should not belong to it. One should also consider that inserting a URL into a sentence looks horrible and is normally not done (by me, at least).

About splitting at boundaries: contrary to what I said before, using regular expressions seems to be a valid and efficient way. (I thought you would have to escape tag content and attributes in different ways (percent-encoding vs. HTML entities). After reading the HTML4 specification I realized this is not the case, as content and attributes are both escaped using HTML entities.)

Regards, PyroPeter -- freenode/pyropeter "12:50 - Ich drücke Return."
On Fri, Oct 01, 2010 at 06:23:06PM +0200, PyroPeter wrote:
Using regular expressions is an efficient way, but they should be applied before htmlspecialchars() or anything similar. E.g. we could use preg_match() or preg_match_all() with PREG_OFFSET_CAPTURE to get the positions of all links, then call a function that converts the links and escapes the strings as necessary, and convert the parts that don't contain any links separately using htmlspecialchars().
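A minimal sketch of what that could look like (the function name and the URL pattern here are placeholders for illustration, not the code that was eventually committed): find the link offsets in the raw comment first, then escape link and non-link segments separately.

```php
<?php
// Sketch of the PREG_OFFSET_CAPTURE approach described above:
// locate links in the *raw* comment, then escape each segment on its own.
function render_comment_sketch($comment)
{
	// deliberately simple URL pattern, for illustration only
	preg_match_all('/\b(?:https?|ftp):\/\/\S+/', $comment, $m,
		PREG_OFFSET_CAPTURE);

	$html = '';
	$pos = 0;
	foreach ($m[0] as $match) {
		list($url, $offset) = $match;
		// plain text before the link
		$html .= htmlspecialchars(substr($comment, $pos, $offset - $pos));
		// the link itself, escaped once for both href and content
		$e = htmlspecialchars($url);
		$html .= '<a href="' . $e . '">' . $e . '</a>';
		$pos = $offset + strlen($url);
	}
	// remaining plain text after the last link
	$html .= htmlspecialchars(substr($comment, $pos));
	return $html;
}

$out = render_comment_sketch('see http://example.org/?a=1&b=2 please');
```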
On Fri, Oct 01, 2010 at 06:23:06PM +0200, PyroPeter wrote:
I didn't read the whole thread, but as far as I understand, you're looking for a proper way to correctly find URLs in comments. John Gruber's regex seems quite right for this: http://daringfireball.net/2010/07/improved_regex_for_matching_urls

Does this help?

Jan-Erik (badboy_)
Comments are now split at link boundaries and the links are converted separately. I find this to be a much cleaner way than re-converting comments that have already been converted using htmlspecialchars(). This also doesn't require any callback procedure.
---
 web/lib/aur.inc               | 24 ++++++++++++++++++++++++
 web/template/pkg_comments.php |  2 +-
 2 files changed, 25 insertions(+), 1 deletions(-)

diff --git a/web/lib/aur.inc b/web/lib/aur.inc
index bd69c4c..a6292ca 100644
--- a/web/lib/aur.inc
+++ b/web/lib/aur.inc
@@ -494,3 +494,27 @@ function salted_hash($passwd, $salt)
 	}
 	return md5($salt . $passwd);
 }
+
+function parse_comment($comment)
+{
+	$url_pattern = '/(\b(?:https?|ftp):\/\/[\w\/\#~:.?+=&%@!\-;,]+?' .
+		'(?=[.:?\-;,]*(?:[^\w\/\#~:.?+=&%@!\-;,]|$)))/iS';
+
+	$matches = preg_split($url_pattern, $comment, -1,
+		PREG_SPLIT_DELIM_CAPTURE);
+
+	$html = '';
+	for ($i = 0; $i < count($matches); $i++) {
+		if ($i % 2) {
+			# convert links
+			$html .= '<a href="' . htmlspecialchars($matches[$i]) .
+				'">' . htmlspecialchars($matches[$i]) . '</a>';
+		}
+		else {
+			# convert everything else
+			$html .= nl2br(htmlspecialchars($matches[$i]));
+		}
+	}
+
+	return $html;
+}
diff --git a/web/template/pkg_comments.php b/web/template/pkg_comments.php
index 02171a0..2ca9bf0 100644
--- a/web/template/pkg_comments.php
+++ b/web/template/pkg_comments.php
@@ -20,7 +20,7 @@ while (list($indx, $carr) = each($comments)) { ?>
 ?></div>
 <blockquote class="comment-body">
 <div>
-<?php echo nl2br(htmlspecialchars($carr['Comments'])) ?>
+<?php echo parse_comment($carr['Comments']) ?>
 </div>
 </blockquote>
 <?php
--
1.7.3.1
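The alternation trick in the final patch relies on preg_split() with PREG_SPLIT_DELIM_CAPTURE: because the whole URL pattern is wrapped in a capturing group, the resulting array alternates between plain text (even indices) and captured links (odd indices). A self-contained illustration of the technique, using a deliberately simplified URL pattern rather than the patch's own:

```php
<?php
// Illustrate the split-at-link-boundaries technique from the patch.
$url_pattern = '/(\bhttps?:\/\/[^\s]+[\w\/])/';
$comment = 'See http://example.org/x?a=1&b=2 for details.';
$parts = preg_split($url_pattern, $comment, -1, PREG_SPLIT_DELIM_CAPTURE);
// $parts: ['See ', 'http://example.org/x?a=1&b=2', ' for details.']

$html = '';
foreach ($parts as $i => $part) {
	if ($i % 2) {   // odd index: a captured link
		$e = htmlspecialchars($part);
		$html .= '<a href="' . $e . '">' . $e . '</a>';
	} else {        // even index: plain text
		$html .= htmlspecialchars($part);
	}
}
```

Escaping each half separately is what makes this cleaner than the earlier patches: the URL is never escaped twice, and no entity produced by htmlspecialchars() ever reaches the URL regex.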