I receive a block of text from the database that occasionally contains URLs, e.g. http://site.tld/lorem.ipsum/whatever
Now I want to turn each of them into a nice clickable link for the user, with a helper method. Such as:
<a href="http://site.tld/lorem.ipsum/whatever">http://site.tld/lorem.ipsum/whatever</a>
Of course, anyone can do this: [^\s]+
does the trick. But the obvious problem is that if I have a dot (.), for example, right after the URL, I don't want it to be included in the link. So the URL has to stop before certain characters, but I can't write a rule that simply excludes those characters, since the dot I mentioned is a "URL stopper" that can also appear inside the URL.
My first guess was this:
(http\:\/\/[^\s]+)(\,|\.|\;|\:)?
which would be replaced with
<a href="$1">$1</a>$2
But it does not work: the second group is optional, and the first one is greedy and accepts anything except whitespace, so those trailing characters always end up inside the first group.
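To make the failure concrete, here is a minimal sketch of the first attempt. It is written in Python purely for illustration (the sample sentence and the variable names are made up), and the backreferences are spelled \1 and \2 instead of $1 and $2.

```python
import re

# The first attempt from above; escaping ":" and "/" is harmless in Python.
FIRST_ATTEMPT = re.compile(r'(http\:\/\/[^\s]+)(\,|\.|\;|\:)?')

# Hypothetical sample text with a dot right after the URL.
text = "See http://site.tld/lorem.ipsum/whatever. Thanks"
match = FIRST_ATTEMPT.search(text)

# Group 1 is greedy, so it runs up to the next whitespace and takes the dot.
print(repr(match.group(1)))
# The optional group 2 has nothing left to match and stays empty (None).
print(repr(match.group(2)))

# The replacement therefore puts the dot inside the link.
linked = FIRST_ATTEMPT.sub(r'<a href="\1">\1</a>\2', text)
print(linked)
```

Because the optional group is allowed to match nothing, the engine never needs to backtrack and hand the dot over to it.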
I really appreciate your help, but honestly, I don't want a gigantic rule found on the internet that merely seems to work at the moment. I'm sure there's a cool way to do this. I have a decent understanding of regular expressions, but this scenario is something I haven't run into before. Or maybe I'm missing something; after all, it is past 3 AM.
Thanks!
Edit:
@Chirael cleared it up for me, but here is my final solution:
(http\:\/\/[^\s]+?)(\,|\.|\;|\:)?(\s|$)
- I'm escaping the slashes because I'm using PHP
- I added more characters as "URL stoppers" in the second variable
- Since the first variable becomes "non-greedy" and the 2nd one is optional, without the 3rd one the link would only contain the first char after "http://". There was also a problem when the URL was the last thing in the text, so the 3rd variable can be either a whitespace char or the end of the text.
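
For reference, a minimal sketch of a helper built around the final pattern. It is in Python for illustration only, and the function name linkify is hypothetical. The one thing to watch is that the 3rd variable consumes the whitespace after the URL, so it has to be written back in the replacement; with PHP's preg_replace the replacement string would presumably be <a href="$1">$1</a>$2$3.

```python
import re

# The final pattern from above, unchanged.
URL_PATTERN = re.compile(r'(http\:\/\/[^\s]+?)(\,|\.|\;|\:)?(\s|$)')


def linkify(text):
    """Wrap each http:// URL in an anchor, leaving one trailing stopper outside.

    Hypothetical helper name; only a single trailing stopper is handled,
    exactly as in the pattern itself.
    """
    # \2 is the optional stopper, \3 the whitespace (or end of text) that the
    # pattern consumed. Both are written back after the closing tag.
    return URL_PATTERN.sub(r'<a href="\1">\1</a>\2\3', text)


print(linkify("See http://site.tld/lorem.ipsum/whatever. Thanks"))
print(linkify("Visit http://site.tld/page"))
```

The first call covers a URL followed by a dot and more text, the second a URL that is the last thing in the text.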