I have a web application that is developed and ready to be deployed. The web part of it was designed using M$ FrontPage. None of the developers cared about the weird proprietary tags that FrontPage inserts into HTML. I don't remember the tags off the top of my head, but I remember seeing tags such as <webbot>. Now my client doesn't want to see a bunch of useless tags obscuring the HTML when someone does a view source. This is not good from an application maintenance perspective either.

I tried googling for tools that would remove these tags from the HTML without unknown side effects, and I haven't really found anything useful. Has anyone dealt with this kind of problem before? If you did, did you use a tool for this, or did you write your own regex-based replace utility or something?

Please share your thoughts on this.

+4  A: 

The final solution to this problem is:

Do not use FrontPage!

I think the reason you are not finding any conversion tools is that almost every developer who would care enough to filter out the MS-specific tags has moved on to another editor.

If it is important enough for your client that the source looks reasonably clean, it should definitely be important enough for your fellow developers.

Jacco
Seriously, DON'T use it! This is not 1997 anymore!
Natrium
@Natrium :-D love the sarcasm. Wonder why M$ produces this piece of beep with every new version of Office. This is a common scenario in large-scale service companies that do offshoring: a bunch of grads with experience irrelevant to the field of computer science are hired to do a below-par job. Now, having said that, what do you think my options are, other than going back in time and not using FrontPage?
Jay
Couldn't agree more, but doesn't actually help the OP
annakata
What piece of beep? They no longer produce FrontPage, and I don't think SharePoint Designer (the successor) puts those tags into HTML anymore without being asked to do it.
John Saunders
@John I wasn't aware of the fact that they no longer produce FrontPage. Isn't SharePoint Designer a CMS?
Jay
@John piece of beep : you may substitute your favorite expletive in beep's place, I tried to censor it :)
Jay
+3  A: 

For an online solution, you should check out Webmaster-toolkit's Frontpage Code Cleaner.

Abinadi
@Abinadi thank you for the link.
Jay
+1  A: 

You can remove the FP proprietary tags. I used my own regex to remove the opening and closing garbage tags: </?xx[^>]*> (change 'xx' to the name of the tag you are removing).
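For anyone who wants to script this, here is a minimal Python sketch of that kind of regex replace. The function name strip_tags, the sample markup, and the bot attribute are made up for illustration; only the <webbot> tag name comes from the question. It removes the opening and closing tags but keeps whatever text sits between them, so check the output before overwriting anything.

```python
import re

def strip_tags(html, tag_names):
    """Remove opening and closing tags for each given name, keeping inner text."""
    for name in tag_names:
        # </?name ...> : optional slash, the tag name as a whole word, any attributes
        pattern = re.compile(r"</?" + re.escape(name) + r"\b[^>]*>", re.IGNORECASE)
        html = pattern.sub("", html)
    return html

# Hypothetical sample markup, not taken from a real FrontPage page
sample = '<p>Hello <webbot bot="Timestamp">today</webbot>!</p>'
cleaned = strip_tags(sample, ["webbot"])
print(cleaned)  # <p>Hello today!</p>
```

The \b after the tag name keeps the pattern from eating unrelated tags that merely start with the same letters. A regex like this does not understand HTML, so a ">" inside an attribute value would trip it up.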

Are you breaking totally away from FrontPage? If the site is edited in page view, FP will put the tags back.

Also FP likes to control everything and writes a _vti_cnf file for each file it uploads. It gets testy if you ftp from a program that is not FP and that file is missing (especially if you are using FP extensions).

Make sure you put in a DOCTYPE - I don't think FP does that automatically.
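If there are a lot of pages, a quick check like the following can flag the ones that still lack a DOCTYPE. This is only a rough sketch: has_doctype is a name I made up, and it only looks at the very start of the text, so a page with a comment or byte-order mark before the DOCTYPE would be reported as missing one.

```python
import re

def has_doctype(html):
    """Return True if the text begins (after optional whitespace) with a DOCTYPE."""
    return re.match(r"\s*<!DOCTYPE\b", html, re.IGNORECASE) is not None

print(has_doctype("<!DOCTYPE html><html></html>"))
print(has_doctype("<html><head></head></html>"))
```

Which DOCTYPE to add to the pages that fail the check depends on the markup, so that part is left to you.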

Emily
@Emily Thanks for your response. Yes, I am breaking away from FrontPage totally. I will most likely not be editing these pages in FrontPage again.
Jay
+1  A: 

HTML Tidy will do a wonderful job of cleaning up just about any mess you can find.

Scott
@Scott Tried out HTML Tidy. It certainly isn't very user friendly to begin with. It badly needs a GUI.
Jay
@Jay I rarely use the program directly. I usually use Tidy through Notepad++ which comes with it as a plugin and a simple menu interface. There are other programs as well though they are not coming to mind at the moment.
Scott
kool! didn't know that a plugin existed for this. Will give it a shot.. thanks for the suggestion Scott.
Jay