I have a TextBox in a webpage whose contents I parse and modify with JavaScript to format them as HTML. 90% of it works really well; the last main thing I'm trying to support is copying and pasting from a Word document. I have it mostly working, but I'm stuck on finding lists and wrapping them in a UL tag.
So, using regular expressions, I'd like to find the list in this text:
<p>paragraph goes here
<li>goes here<br/>
<li>list item 2<br/>
<li>list item 3<br/>
<p>another paragraph
and wrap the <li> section with a <ul> tag. My regex-fu isn't that good; can someone help?
----- update -----
While I appreciate all the feedback basically indicating that I need to start from scratch with this issue, I do not have the time to do that. I completely understand that regex is not the ideal way to handle HTML formatting, but the way I am using it now will handle most of what my users are looking to do. I only need a subset of HTML tags, not a full HTML editor.
The source of my content will be a user copying and pasting from a Word document about 99.9% of the time. I use regex to insert HTML tags into plain text. For the lists, I find the bullet character MS Word inserts into its copied text and replace that with the <LI> tag. I just want to make it more user-friendly by wrapping the <LI> tags with a <UL> tag.
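For context, the bullet replacement step looks roughly like the sketch below. The function name is made up for illustration, and I'm assuming the pasted bullet is U+2022 or U+00B7; Word may well emit a different character depending on the list style and font, so treat the character class as a placeholder.

```javascript
// Hypothetical helper: turns each line that starts with a bullet character
// into a closed <li>...</li> item. Lines without a bullet are left untouched.
// ASSUMPTION: the bullet is U+2022 or U+00B7; adjust the class for other bullets.
function bulletsToListItems(text) {
  // ^ and $ match at line boundaries because of the m flag;
  // (.*) captures the rest of the line after the bullet and optional spacing.
  return text.replace(/^[ \t]*[\u2022\u00B7][ \t]*(.*)$/gm, "<li>$1</li>");
}
```

Closing each item with </li> at this stage keeps the later wrapping step simpler, since the end of every item is then unambiguous.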
I'll look into ending my tags properly, so, assuming they're properly ended, what would be the regex to wrap my list items with a <ul> tag?
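To make the goal concrete, here is a rough sketch of the kind of thing I'm after, not a definitive solution. The function name is hypothetical, and it assumes every item is written as a plain <li>...</li> pair (no attributes), that only whitespace separates neighbouring items, and that lists are never nested.

```javascript
// Hypothetical helper: wraps each run of consecutive <li>...</li> items
// in a single <ul>...</ul> pair.
// ASSUMPTIONS: items are properly closed, have no attributes, are separated
// only by whitespace, and there are no nested lists.
function wrapListItems(html) {
  // One or more closed items, each optionally followed by whitespace.
  // [\s\S]*? lazily matches the item body, including line breaks.
  return html.replace(/(?:<li>[\s\S]*?<\/li>\s*)+/gi, function (run) {
    // Trim the run so the trailing whitespace it swallowed is normalised.
    return "<ul>\n" + run.trim() + "\n</ul>\n";
  });
}
```

Runs separated by anything other than whitespace, such as a paragraph, would each get their own <ul> wrapper.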
Thanks!