[-A-Z0-9+&@#\/%?=~_|!:,.;]*
appears to be slurping up most of the URL, so we need to jam the .gov and .edu in here somewhere. The quickest solution would be:
[-A-Z0-9+&@#\/%?=~_|!:,.;]+(\.gov|\.edu)[-A-Z0-9+&@#\/%?=~_|!:,.;]*
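However, this will also match a URL like http://www.example.com/evil.gov/test.html, because the first character class includes the slash and so lets .gov show up in the path instead of the host.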
To fix this, we can take the \/ out of the character class that matches before the top-level domain:
[-A-Z0-9+&@#%?=~_|!:,.;]+(\.gov|\.edu)[-A-Z0-9+&@#\/%?=~_|!:,.;]*
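If you want to check the difference between the two fragments, here is a rough Python sketch. The names QUICK, FIXED and matches_after_scheme are made up for illustration, and re.IGNORECASE is my assumption, since the classes only list A-Z. It strips the scheme and runs each fragment against what is left.

```python
import re

# Character classes copied from the two fragments above.
# re.IGNORECASE is an assumption; the classes only list A-Z.
QUICK = re.compile(
    r"[-A-Z0-9+&@#\/%?=~_|!:,.;]+(\.gov|\.edu)[-A-Z0-9+&@#\/%?=~_|!:,.;]*",
    re.IGNORECASE,
)
FIXED = re.compile(
    r"[-A-Z0-9+&@#%?=~_|!:,.;]+(\.gov|\.edu)[-A-Z0-9+&@#\/%?=~_|!:,.;]*",
    re.IGNORECASE,
)

def matches_after_scheme(pattern, url):
    """Return True if pattern matches starting right after '://' (hypothetical helper)."""
    rest = url.split("://", 1)[1]
    return pattern.match(rest) is not None

evil = "http://www.example.com/evil.gov/test.html"
good = "http://www.example.gov/test.html"

print(matches_after_scheme(QUICK, evil))  # True: .gov is found in the path
print(matches_after_scheme(FIXED, evil))  # False: the host part stops at the first /
print(matches_after_scheme(FIXED, good))  # True
```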
Putting it all together, we have:
/(\b(https?|ftp):\/\/[-A-Z0-9+&@#%?=~_|!:,.;]+(\.gov|\.edu)[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|]?)/
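A plain http://example.gov has nothing after the TLD, and the last token would otherwise demand one more character, so I added a ? to make it optional. The classes only list A-Z, so use the case-insensitive flag (i) to match lowercase URLs.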
Damn that is ugly.
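For anyone who wants to try the final pattern, here is a small Python sketch with the / delimiters dropped and re.IGNORECASE assumed. One thing it shows: with only the last token optional, the greedy class before it can keep a trailing period. GROUPED_TAIL_RE is a hypothetical variant of mine (not part of the pattern above) that makes the whole tail optional instead; find_urls and sample are also just illustrative names.

```python
import re

# The final pattern from above. re.IGNORECASE is an assumption.
FINAL_RE = re.compile(
    r"(\b(https?|ftp):\/\/[-A-Z0-9+&@#%?=~_|!:,.;]+(\.gov|\.edu)"
    r"[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|]?)",
    re.IGNORECASE,
)

# Hypothetical variant: the whole tail is optional, not just the last token.
GROUPED_TAIL_RE = re.compile(
    r"(\b(https?|ftp):\/\/[-A-Z0-9+&@#%?=~_|!:,.;]+(\.gov|\.edu)"
    r"(?:[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])?)",
    re.IGNORECASE,
)

def find_urls(pattern, text):
    """Return every full URL (outer capture group) the pattern finds in text."""
    return [m.group(1) for m in pattern.finditer(text)]

sample = (
    "Try http://example.gov or http://www.example.com/evil.gov/test.html "
    "and then ftp://files.example.edu/pub/data.txt."
)

print(find_urls(FINAL_RE, sample))
# ['http://example.gov', 'ftp://files.example.edu/pub/data.txt.']
print(find_urls(GROUPED_TAIL_RE, sample))
# ['http://example.gov', 'ftp://files.example.edu/pub/data.txt']
```

Both versions skip the evil.gov URL and accept the bare example.gov; they differ only in whether sentence punctuation right after a URL gets pulled into the match.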