How to extract all URLs from a text in Ruby?
I tried some libraries, but they fail in some cases. What's the best way?
You can use a regex with String#scan:
string.scan(/(https?:\/\/([-\w\.]+)+(:\d+)?(\/([\w\/_\.]*(\?\S+)?)?)?)/)
You can get started with that regex and adjust it according to your needs.
What cases are failing?
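One thing to watch for with the regex above: because it contains capture groups, scan returns an array of groups per match rather than plain strings. Here is a minimal sketch of two ways to get the full URLs back. The sample text, the variable names, and the non-capturing variant are my own assumptions for illustration, not part of the original answer.

```ruby
# Sample input; the text and URLs are made up for illustration.
text = "See http://example.com/path?x=1 and https://foo.example.org:8080/a/b for details."

# The regex from the answer above, unchanged. Because it contains capture
# groups, scan returns one array of groups per match.
grouped = text.scan(/(https?:\/\/([-\w\.]+)+(:\d+)?(\/([\w\/_\.]*(\?\S+)?)?)?)/)

# The outermost group is the whole URL, so take the first element of each match.
urls = grouped.map(&:first)

# Alternative (an assumed rewrite): with non-capturing groups, scan returns
# plain strings directly.
url_pattern = /https?:\/\/[-\w.]+(?::\d+)?(?:\/(?:[\w\/_.]*(?:\?\S+)?)?)?/
same_urls = text.scan(url_pattern)

puts urls
```

Note that either pattern will also swallow a trailing period if a URL without a path ends a sentence, so it probably still needs adjusting for real-world text.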
According to the library regexpert, you can use
regexp = /(^$)|(^(http|https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(([0-9]{1,5})?\/.*)?$)/ix
and then scan the text with it.
EDIT: It seems the regexp also matches the empty string. Just remove the initial (^$)| alternative and you're done.
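For what it's worth, this pattern is anchored with ^ and $, which in Ruby match at the start and end of each line, so scan will only pick up URLs that sit on a line by themselves. A small sketch of that behaviour, with the empty-string alternative removed as described in the edit; the sample text and variable names are assumptions for illustration.

```ruby
# The regexp from above with the initial (^$)| alternative removed.
regexp = /(^(http|https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(([0-9]{1,5})?\/.*)?$)/ix

# Made-up sample: the first and third lines are bare URLs, the second
# embeds a URL inside a sentence.
text = "http://example.com/page\nvisit http://example.org today\nhttps://sub.example.net"

# scan returns one array of capture groups per match; the outermost group
# holds the whole URL.
matches = text.scan(regexp).map(&:first)

puts matches
```

If the URLs you need are embedded in running text, you would probably have to drop the anchors or use an unanchored pattern instead.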