Hi, I have a question on how best to get XHTML into Excel, let the user edit it in Excel, and then convert it back to XHTML at the end.

The background is that I have a web app in which the texts are stored in xhtml. These I can export to an excel file with the html in the excel cell.

I can also reimport this cell back to XHTML, but the problem is that editing is very difficult for a normal user, because the HTML is simply text in the Excel cell.

Is this the wrong approach? Should I just use a custom app with multiple HTML editors? The users would prefer an Excel document they can easily exchange and work on offline.

Here is a sample, but it could be any XHTML that the user enters in the HTML editor (TinyMCE) with XHTML Strict. I assume there is no HTML editor plugin for Excel. I've never heard of one and haven't been able to find one, although that would be the simplest solution.

Review

* A Bullet: The impact of
* A bullet 2: Test text

Heading

* Bullet: text
* Bullet: Text

Anybody have any ideas? Thanks, Crocked

A: 

A solution I once implemented for my own personal use was an Excel macro that parsed the HTML file and populated the sheet with only the data. When I was done editing, the macro would read the data in the sheet and generate the HTML file.

My case, however, involved a very simple HTML file, and I basically only had to parse a few tags.

However you might be able to find a solution/utility which makes it easier for you to do that.
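The round trip described above can be sketched roughly as follows. This is not the original macro (which was written in Excel); it is a minimal Python illustration, and the sample fragment, the function names `xhtml_to_rows` and `rows_to_xhtml`, and the choice to handle only `h2` and `li` are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical sample; the real XHTML would come from the web app.
SAMPLE = ("<div><h2>Review</h2><ul><li>A Bullet: The impact of</li>"
          "<li>A bullet 2: Test text</li></ul></div>")

def xhtml_to_rows(xhtml):
    """Flatten a simple XHTML fragment into (tag, text) rows, one per sheet row."""
    root = ET.fromstring(xhtml)
    rows = []
    for el in root.iter():
        text = (el.text or "").strip()
        # Only a very few tags are handled, as in the simple case described.
        if el.tag in ("h2", "li") and text:
            rows.append((el.tag, text))
    return rows

def rows_to_xhtml(rows):
    """Rebuild the fragment, wrapping consecutive li rows in a single ul."""
    parts = ["<div>"]
    in_list = False
    for tag, text in rows:
        if tag == "li" and not in_list:
            parts.append("<ul>")
            in_list = True
        elif tag != "li" and in_list:
            parts.append("</ul>")
            in_list = False
        parts.append("<%s>%s</%s>" % (tag, escape(text), tag))
    if in_list:
        parts.append("</ul>")
    parts.append("</div>")
    return "".join(parts)

rows = xhtml_to_rows(SAMPLE)
rebuilt = rows_to_xhtml(rows)
```

Inline markup inside a heading or list item (bold text, links) is dropped by this sketch, which is one reason the approach gets harder as more tags need to be supported.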

Mostlyharmless
While that would work if I had a guaranteed XHTML structure, in this case I can't control what will be in the XHTML (at least not currently), as the user can alter it as they see fit.
Crocked
Parsing the XHTML could still be a good idea, though; I could look at extracting each tag into a separate cell in Excel. The only problem is that if they reimport it and the original HTML structure has changed in the meantime (which is why I want the Excel-to-XHTML step), it probably won't work as desired.
Crocked
Maybe I could store the tag information in the excel document somehow
Crocked
Yeah, it does get exponentially complex as the number of tags to be handled increases. What do you mean by "store the tag information in excel"?
Mostlyharmless
A: 

Without seeing your specific documents, I can only suggest a general sort of solution.

If you can break down the document by tags, you can display the data to the end user and store the tag information in a hidden worksheet. So, if you have:

<td>1</td><td style="color:red">-2</td><td>33</td>

You could show 1, -2, and 33 in the first three cells of the visible worksheet and store td, td style="color:red", td in the matching cells on the invisible sheet. Then, you store the tag data out of sight, minimizing user confusion, and can retrieve it when you need to generate the xhtml file.
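A minimal Python sketch of that split, assuming a single row of `td` cells like the one described. The names `split_row` and `join_row` are hypothetical, and the regular expression is a simplification that only copes with flat, non-nested cells.

```python
import re
from xml.sax.saxutils import escape, unescape

# Simplified pattern: captures the opening tag (with attributes) and the cell text.
CELL_RE = re.compile(r"<(td[^>]*)>(.*?)</td>", re.S)

def split_row(row_xhtml):
    """Return (visible values, hidden tag strings) for each td in the row."""
    pairs = CELL_RE.findall(row_xhtml)
    visible = [unescape(text) for _, text in pairs]   # goes on the visible sheet
    hidden = [tag for tag, _ in pairs]                # goes on the hidden sheet
    return visible, hidden

def join_row(visible, hidden):
    """Recombine edited values with the stored tags to regenerate the XHTML."""
    cells = []
    for i, text in enumerate(visible):
        # Assumption: cells the user added have no stored tag, so use a plain td.
        tag = hidden[i] if i < len(hidden) else "td"
        name = tag.split()[0]
        cells.append("<%s>%s</%s>" % (tag, escape(text), name))
    return "".join(cells)

row = '<td>1</td><td style="color:red">-2</td><td>33</td>'
visible, hidden = split_row(row)
```

Values the user edits in place keep their original tag, while newly added cells fall back to a default, which is only a guess at what the user intended.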

Jekke
I've added a sample HTML below, but the end user can enter the XHTML freeform in the editor. Storing the structure in a hidden worksheet would work well if the text remains the same, but if they want to add new lines or cells to the text, I need to interpret it somehow before saving it back.
Crocked
A: 

Here is a sample, but it could be any XHTML that the user enters in the HTML editor (TinyMCE) with XHTML Strict. I assume there is no HTML editor plugin for Excel. I've never heard of one.

Review
  • A Bullet: The impact of
  • A bullet 2: Test text

Heading

  • Bullet: text
  • Bullet: Text

Crocked
You should be updating your question or using comments on the answers you are responding to, rather than using answers like a forum.
Jeff Martin
Ok thanks. Sorry about that
Crocked
A: 

Without seeing the complexity of the documents it may be hard to say, but what about XSLT? The Excel file itself, if it is an .xlsx, is XML, and either way it would export to XML. You could make an XSLT template that would transform it to the final XHTML document.

Jeff Martin
Unfortunately I can't use .xlsx, as I'm limited to Office 2003. I've done quite a lot of work with XSLT, and it would probably work quite well with some fine-tuning if it were an option.
Crocked
A: 

My suggestion would be to actually do the following:

xml to excel to xml

My reasoning for this is that

  1. The user would be able to open the XML and actually get a basic understanding of the document, since it's clear and simple (XHTML has tags they don't care about, like "html" or "head").
  2. Excel can open XML files.
  3. It's easy for you to write a macro that exports the Excel data to XML (there are lots of nice XML readers/writers).
  4. You can define an XSLT template for that XML file that would display it nicely as an HTML document.
  5. I noticed your comment about being limited to Office 2003. You can define an XSLT stylesheet, and then, if it's simple enough, Firefox or Internet Explorer will actually render it for you. You can check out browser support here: http://www.w3schools.com/xsl/xsl_browsers.asp
Anton
A: 

If you're already familiar with XSLT, it sounds like that would be the best option. Although Excel 2003 doesn't use the .xlsx format, it does read (and save) what they call "SpreadsheetML", which is also XML. I've found that the easiest way to learn how to do the transforms to SpreadsheetML is just to open up a spreadsheet in Excel, format it however you like, then Save As->XML Spreadsheet (*.xml). Open up that XML file and you'll see all the styles Excel defined for your workbook, and you can then use those as a "template" to write XSLT that will transform your XHTML into that format.

For further reading and a decent walkthrough, look at Dive into SpreadsheetML (part 1 of 2) on MSDN.

Presumably you could then just use another XSL transform to pull out the data you need going from SpreadsheetML to XHTML.
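To get a feel for the SpreadsheetML shape before writing the XSLT, here is a rough Python sketch of both directions. It uses Python's `xml.etree.ElementTree` instead of XSLT, so it illustrates the target format rather than the transform itself. The function names, the worksheet name "Texts", and the (tag, text) row layout are assumptions; the files Excel saves also contain styles and other elements that are omitted here.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

SS = "urn:schemas-microsoft-com:office:spreadsheet"  # SpreadsheetML namespace

def rows_to_spreadsheetml(rows):
    """Write rows of strings as a bare-bones XML Spreadsheet document."""
    out = ['<?xml version="1.0"?>',
           '<?mso-application progid="Excel.Sheet"?>',
           '<Workbook xmlns="%s" xmlns:ss="%s">' % (SS, SS),
           '<Worksheet ss:Name="Texts"><Table>']
    for row in rows:
        cells = "".join(
            '<Cell><Data ss:Type="String">%s</Data></Cell>' % escape(value)
            for value in row)
        out.append("<Row>%s</Row>" % cells)
    out.append("</Table></Worksheet></Workbook>")
    return "\n".join(out)

def spreadsheetml_to_rows(xml_text):
    """Read the cell strings back out, one list per Row element."""
    root = ET.fromstring(xml_text)
    ns = {"ss": SS}
    rows = []
    for row in root.iterfind(".//ss:Row", ns):
        rows.append([d.text or "" for d in row.iterfind("ss:Cell/ss:Data", ns)])
    return rows

# Hypothetical layout: column 1 holds the tag, column 2 the editable text.
data = [["h2", "Review"], ["li", "A Bullet: The impact of"]]
doc = rows_to_spreadsheetml(data)
```

When Excel itself saves a sheet, it skips empty cells and marks the next cell with an `ss:Index` attribute, so a reader for real files would need to handle that as well.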

Matt Winckler