On Fri, Oct 01, 2010 at 06:23:06PM +0200, PyroPeter wrote:
> On 10/01/2010 05:52 PM, Lukas Fleischer wrote:
> >This won't match URLs like
> >"https://aur.archlinux.org/packages.php?O=0&K=" and an ampersand at the
> >end of a URL won't be converted correctly :/ I'll try to implement it
> >more properly in the next few days. Maybe I'll actually go with splitting
> >comments at link boundaries as you suggested before... :)
> Well, that's the problem. Which characters should belong to the end of
> the URL, and which should not? There could also be cases in which
> punctuation belongs to the URL. If punctuation is parsed as not
> belonging to the URL, there would be no way to post a working link
> to certain URLs. If punctuation is parsed as part of the URL, one could
> insert a space between the URL and the punctuation that should not
> belong to the URL. One should also consider that inserting a URL into
> a sentence looks horrible and is normally not done (by me, at least).
> About splitting at boundaries: Contrary to what I have said before,
> using regular expressions seems to be a valid and efficient way.
> (I thought you would have to escape tag-content and attributes in
> different ways (percent-encoding vs. HTML entities). After reading
> the HTML4 specification I realized this is not the case, as content and
> attributes are both escaped using HTML entities.)
> Regards, PyroPeter
I didn't read the whole thread, but as far as I understand you're
looking for a proper way to find URLs in comments.

John Gruber's Regex seems quite right for this:

Does this help?
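
To make the trade-off from the quoted mail concrete, here is a minimal
Python sketch. It is not Gruber's pattern and not the AUR code; the names
URL_RE, TRAILING and linkify are made up for illustration, and Python is
used only because it is short to read. It assumes a URL is "http(s)://"
followed by non-whitespace, treats trailing sentence punctuation as not
belonging to the URL, splits the comment at link boundaries, and escapes
both the text and the href attribute with the same HTML entities.

```python
import html
import re

# Assumption: a URL is a scheme followed by any run of non-whitespace.
URL_RE = re.compile(r"https?://\S+")

# Assumption: these characters are sentence punctuation when they end a match.
TRAILING = ".,;:!?"


def linkify(comment: str) -> str:
    """Return HTML for a plain-text comment with URLs turned into links."""
    parts = []
    pos = 0
    for match in URL_RE.finditer(comment):
        # Drop trailing punctuation from the link; it stays in the plain text.
        url = match.group(0).rstrip(TRAILING)
        # Text before the link is escaped as ordinary content.
        parts.append(html.escape(comment[pos:match.start()]))
        # The same entity escaping is used for the attribute and the link text.
        safe = html.escape(url, quote=True)
        parts.append('<a href="{0}">{0}</a>'.format(safe))
        pos = match.start() + len(url)
    # Remaining text after the last link (including stripped punctuation).
    parts.append(html.escape(comment[pos:]))
    return "".join(parts)


print(linkify("See https://aur.archlinux.org/packages.php?O=0&K= for details."))
```

With this heuristic a URL that really ends in "." or "?" cannot be
linked correctly, which is exactly the problem described above, and
brackets around a URL are not handled at all. A URL ending in "&" or "="
is kept whole, and the ampersand becomes "&amp;" in both the href and
the link text.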

Jan-Erik (badboy_)

