Hi!

How can I slice an HTML text string into a number of pages so that it can be read like a book?

Thanks for suggestions.

+2  A: 

This assumes you are happy recognising only a subset of HTML markup, without CSS. Here I assume only <p/>, <b/>, <i/> and <br/> tags, plus <font size=/> for font size changes (other attributes ignored) and <img> tags for images, with everything but src, width and height ignored and accurate width and height mandatory. All other tags and attributes are ignored:-

  1. Run the source HTML through TidyLib to get well-formed XHTML. TidyLib seems to have an MIT license - http://tidy.sourceforge.net/#source

  2. SAX-parse the XHTML output of TidyLib using NSXMLParser into a custom object model (unless you are exclusively targeting later versions of iPhone OS with a public built-in DOM parser API, in which case just use a DOM object model).

  3. Set up a state machine with a caret position at top left of page and initial font size and formatting, page number of 1, maximum height of glyphs/images in current line of zero, and empty list of page boundaries.

  4. For each run of text or image in the object model, apply the preceding font size/format modifications, then measure the text using the iPhone text measurement calls, reducing the text length (trim to the nearest space or hyphen) until it fits on the current line. On a line wrap, reset the caret to the beginning of the next line and continue with the rest of the run. Then apply the following font size and formatting changes. Over-count the width and height of text by some factor where this turns out to be needed to prevent page overflow in the actual page rendering engine (UIWebView; you will have to experiment to see what the factors in the rendering engine are). Whenever the caret passes the bottom of the page, record a page boundary in the list, increment the page number and reset the caret to the top left.

  5. Convert the objects between page boundaries to simplified XHTML for each page. You may wish to add some CSS at this point, for example to format link colours. Local links to anchors that end up on a different page must be rewritten to load that page. Perhaps add a page footer/header with page numbers (subtract the size of these from the page height in the earlier steps).

  6. Save the XHTML as a set of files, one per page.
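
Steps 3 and 4 can be sketched roughly as follows. This is only an illustration in Python, not iPhone code: `Run`, `measure` and `paginate` are names I made up, and `measure` is a fixed-ratio stand-in for the real text measurement calls. Images, hyphen splitting and words wider than a line are left out.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """Hypothetical object-model node: text sharing one font size."""
    text: str
    font_size: float = 12.0


def measure(text, font_size):
    """Stand-in for the platform text measurement call (assumption:
    each glyph is 0.5 * font_size wide and 1.2 * font_size tall)."""
    return len(text) * font_size * 0.5, font_size * 1.2


def paginate(runs, page_width, page_height):
    """Return a list of pages; each page is a list of line strings."""
    pages = []         # finished pages
    page = []          # lines on the current page
    line = ""          # text on the current line
    x = y = 0.0        # caret; y is the top of the current line
    line_height = 0.0  # tallest glyph seen on the current line

    def flush_line():
        nonlocal page, line, x, y, line_height
        if not line:
            return
        # Line would pass the bottom of the page: record a page boundary.
        if page and y + line_height > page_height:
            pages.append(page)
            page, y = [], 0.0
        page.append(line)
        y += line_height
        line, x, line_height = "", 0.0, 0.0

    for run in runs:
        for word in run.text.split():
            piece = word if not line else " " + word
            width, height = measure(piece, run.font_size)
            if line and x + width > page_width:
                flush_line()  # wrap: caret back to the line start
                piece = word
                width, height = measure(piece, run.font_size)
            line += piece
            x += width
            line_height = max(line_height, height)
    flush_line()
    if page:
        pages.append(page)
    return pages
```

Calling `paginate([Run("some long text ...", 14)], 320, 460)` gives back the lines grouped per page; the boundaries between the inner lists are the page boundaries from step 4.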

In essence this will work as long as the source HTML is specially prepared to use a subset of HTML for your app. Arbitrary HTML will not do, though for some files the result might still be good enough to give a rough preview.

The description above assumes you throw away formatting like ALIGN= and tables. It really is a very basic approach and will not reproduce complex pages as originally designed! It might well not suit you!
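
To show what throwing away the rest might look like, here is a rough Python sketch that keeps only the tags and attributes listed above and drops everything else. It uses the standard library `html.parser` rather than TidyLib/NSXMLParser, and `ALLOWED`, `SubsetFilter` and `to_subset` are names invented for the example. Note that text inside dropped tags (for example table cells) is kept as plain text.

```python
from html import escape
from html.parser import HTMLParser

# Tag -> attributes that survive; everything else is discarded.
ALLOWED = {
    "p": (), "b": (), "i": (), "br": (),
    "font": ("size",),
    "img": ("src", "width", "height"),
}
VOID = {"br", "img"}  # written as self-closing XHTML tags


class SubsetFilter(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED:
            return
        kept = "".join(
            f' {name}="{escape(value or "")}"'
            for name, value in attrs
            if name in ALLOWED[tag]
        )
        if tag in VOID:
            self.out.append(f"<{tag}{kept}/>")
        else:
            self.out.append(f"<{tag}{kept}>")

    def handle_endtag(self, tag):
        # Void tags were already closed when they were opened.
        if tag in ALLOWED and tag not in VOID:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def to_subset(html):
    """Return `html` reduced to the restricted tag/attribute subset."""
    parser = SubsetFilter()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

For example, `to_subset('<p align="center">Hi</p>')` drops the ALIGN attribute and returns `<p>Hi</p>`.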

Perhaps the files should be pre-processed before reaching the iPhones in the field, but if the iPhone OS / WebView line-wrapping or text-positioning behaviour changes, the best position for page breaks may change. So you may need to cut your pages smaller than you think they need to be, to allow for some unexpected growth when the rendering engine changes. Hmm. Perhaps not an easy task!
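
One way to build in that slack, sketched in Python: over-count every measurement by a factor and shrink the page height you paginate against. The function names, the factor and the 10% margin are all made up for illustration; the real values have to come from experimenting with the rendering engine.

```python
def padded_measure(measure, factor):
    """Wrap a measurement function so it over-counts by `factor`."""
    def wrapped(text, font_size):
        width, height = measure(text, font_size)
        return width * factor, height * factor
    return wrapped


def usable_height(page_height, header=0.0, footer=0.0, margin=0.10):
    """Page height left for text after header/footer and a safety margin."""
    return (page_height - header - footer) * (1.0 - margin)
```

With a 400-unit page, 20-unit header and footer and a 25% margin, `usable_height(400, 20, 20, 0.25)` leaves 270 units for text.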

I haven't even tried to analyse HTML tables... HTML in its full, unrestricted glory is of course enormously, probably unmanageably, complex.

martinr
Thanks for the great explanation!
sashaeve
A: 

I need help with the same problem.

ashish
Is there any sample code for slicing text to view on the iPhone as book pages?
ashish