Let's say we have some text:

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vivamus cursus vestibulum quam, et tristique nisi tristique ac. Nam ac risus vehicula tortor facilisis tincidunt. Aliquam at nisi vel arcu aliquet dignissim nec et massa. Curabitur vel magna eros, accumsan rutrum augue. Lorem ipsum http://subdomain-1.example.com/dir1 dolor sit amet, consectetur adipiscing elit. Nunc ut vehicula purus. Phasellus nunc diam, hendrerit in ultrices vitae, adipiscing ut odio. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos. Cras molestie felis nec diam sollicitudin placerat pellentesque metus dapibus. Aliquam ipsum ante, lacinia porta http://subdomain-2.example.com/dir2 faucibus non, porttitor at nunc. Quisque suscipit, urna sit amet rhoncus bibendum, elit mi rhoncus lorem, ac luctus lectus nunc in velit.

I need a C# function that finds all URLs and replaces the domain name with a given one, for example example.com with stackoverflow.com, while everything else (the subdomain and the rest of the URL) stays the same.

For example the text should look like this after replacing:

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vivamus cursus vestibulum quam, et tristique nisi tristique ac. Nam ac risus vehicula tortor facilisis tincidunt. Aliquam at nisi vel arcu aliquet dignissim nec et massa. Curabitur vel magna eros, accumsan rutrum augue. Lorem ipsum http://subdomain-1.stackoverflow.com/dir1 dolor sit amet, consectetur adipiscing elit. Nunc ut vehicula purus. Phasellus nunc diam, hendrerit in ultrices vitae, adipiscing ut odio. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos. Cras molestie felis nec diam sollicitudin placerat pellentesque metus dapibus. Aliquam ipsum ante, lacinia porta http://subdomain-2.stackoverflow.com/dir2 faucibus non, porttitor at nunc. Quisque suscipit, urna sit amet rhoncus bibendum, elit mi rhoncus lorem, ac luctus lectus nunc in velit.

+1  A: 

I think this works:

// requires: using System.Text.RegularExpressions;
Regex r = new Regex(@"(?<SCHEME>https?://)(?<SUBDOMAIN>([^./\s]+\.)*)example\.com(?<PATH>/\S*)?");
string newText = r.Replace(text, "${SCHEME}${SUBDOMAIN}stackoverflow.com${PATH}");

I use named groups because they're easier to keep track of and read. The first is the scheme (http:// or https://), the second grabs the subdomain, if there is one, and the last grabs an optional path, since you might have http://foo.example.com, http://foo.example.com/, or http://foo.example.com/bar.
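For a reusable version, here is a minimal sketch of the same idea wrapped in a function. The class name `DomainRewriter` and the method `ReplaceDomain` are made up for illustration. It assumes a URL ends at the next whitespace character, and it adds a lookahead so that something like example.community is left alone. Treat it as a starting point to test against your own data, not a complete URL parser.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DomainRewriter
{
    // Hypothetical helper: swaps oldDomain for newDomain in every
    // http/https URL found in text, keeping scheme, subdomains and path.
    public static string ReplaceDomain(string text, string oldDomain, string newDomain)
    {
        string pattern =
            @"(?<SCHEME>https?://)" +            // http:// or https://
            @"(?<SUBDOMAIN>(?:[^./\s]+\.)*)" +   // zero or more "label." parts
            Regex.Escape(oldDomain) +            // literal old domain, dots escaped
            @"(?![\w-]|\.\w)" +                  // domain must end here (not example.community or example.com.au)
            @"(?<PATH>/\S*)?";                   // optional path, assumed to run to the next whitespace

        // A MatchEvaluator is used so that newDomain is inserted literally,
        // even if it happens to contain a '$' character.
        return Regex.Replace(
            text,
            pattern,
            m => m.Groups["SCHEME"].Value
                 + m.Groups["SUBDOMAIN"].Value
                 + newDomain
                 + m.Groups["PATH"].Value,
            RegexOptions.IgnoreCase);
    }
}
```

Usage would look like `string newText = DomainRewriter.ReplaceDomain(text, "example.com", "stackoverflow.com");`.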

tghw
Does not work for `http://sub2.sub1.example.com`
Hogan
This might be the fix: `(?<SUBDOMAIN>[^.]+\.)*example\.com` etc
Hogan
Also, without my fix this fails too `http://example.com`
Hogan
@Hogan Needs to be in the group, but otherwise yeah, you're right. Fixed.
tghw
A: 

The regular expression you use should look something like:

s!(https?://[\w\-]+)\.domain\.com([\w/]+)!$1.newdomain.org$2!gi

Note: this is Perl substitution syntax, so you will have to rewrite it in C#'s notation (for example with Regex.Replace).
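As a rough illustration of that rewrite, the substitution above might translate to C# as follows. The class and method names are placeholders, and `domain.com` / `newdomain.org` are the same stand-in domains as in the pattern. `Regex.Replace` already replaces every match, which covers the `g` flag, and `RegexOptions.IgnoreCase` covers `i`. Like the original pattern, this sketch only matches URLs that have exactly one subdomain label and a non-empty path.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PerlStyleReplace
{
    // Hypothetical wrapper around the pattern from this answer.
    public static string Rewrite(string text)
    {
        return Regex.Replace(
            text,
            @"(https?://[\w\-]+)\.domain\.com([\w/]+)", // group 1: scheme + subdomain, group 2: path
            "${1}.newdomain.org${2}",                   // ${n} is the .NET form of Perl's $n
            RegexOptions.IgnoreCase);
    }
}
```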

ternaryOperator
This requires the old domain and new domain be on the same TLD.
tghw
does not handle https:
Hogan
I have changed it to address tghw's and Hogan's points. Note that it is just a general example (you should never use someone else's regex without checking and customizing it anyway).
ternaryOperator