How can I split the content of an HTML file into screen-sized chunks to "paginate" it in a WebKit browser?

Each "page" should show only whole lines of text. This means that a line of text must not be cut in half at the top or bottom edge of the screen.

Edit

This question was originally tagged "Android" as my intent is to build an Android ePub reader. However, it appears that the solution can be implemented just with JavaScript and CSS so I broadened the scope of the question to make it platform-independent.

A: 

You could split the pages into separate XHTML files and store them in a folder, e.g. page01, page02. You can then render those pages one by one underneath each other.

Chaoz
And how can I know where to split based on the screen and font size?
hgpc
+11  A: 

Speaking from experience, expect to put a lot of time into this, even for a barebones viewer. An ePub reader was actually the first big project I took on when I started learning C#, and the ePub standard is definitely pretty complex.

You can find the latest version of the spec for ePub here: http://www.idpf.org/specs.htm which includes the OPS (Open Publication Structure), OPF (Open Packaging Format), and OCF (OEBPS Container Format).

Also, if it helps you at all, here is a link to the C# source code of the project I started on:

http://drop.io/epubtest

It's not fleshed out at all; I haven't played with this for months, but if I remember correctly, just stick an ePub in the debug directory, and when you run the program just type some part of the name (e.g. Under the Dome, just type "dome") and it will display the details of the book.

I had it working correctly for a few books, but any eBooks from Google Books broke it completely. They have a completely bizarre implementation of ePub (to me, at least) compared to books from other sources.

Anyway, hopefully some of the structural code in there might help you out!

kcoppock
+1 for sharing your experience and code.
hgpc
+4  A: 

I recently attempted something similar and added some CSS styling to change the layout to horizontal instead of vertical. This gave me the desired effect without having to modify the content of the ePub in any way.

This code should work.

mWebView.setWebViewClient(new WebViewClient() {
    public void onPageFinished(WebView view, String url) {

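        // Column count is the number of 'screens' of text; add one for a partial 'screen'.
        // Cast to double so the division is not truncated, and back to int because Math.floor returns a double.
        int columnCount = (int) Math.floor((double) view.getHeight() / view.getWidth()) + 1;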

        // Must be expressed as a percentage. If not set then the WebView will not stretch to give the desired effect.
        int columnWidth = columnCount * 100;

        String js = "var d = document.getElementsByTagName('body')[0];" + 
            "d.style.WebkitColumnCount=" + columnCount + ";" + 
            "d.style.WebkitColumnWidth='" + columnWidth + "%';";
        mWebView.loadUrl("javascript:(function(){" + js + "})()");
    }
});

mWebView.loadUrl("file:///android_asset/chapter.xml");

So, basically, you're injecting JavaScript to change the styling of the body element after the chapter has loaded (this is very important). The only downside of this approach is that images in the content throw off the calculated column count. It shouldn't be too hard to fix, though. My plan was to inject some JavaScript that adds width and height attributes to every image in the DOM that doesn't have them.
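
The image fix described above could be sketched roughly as follows. This is only an illustration, not tested inside a WebView: `addMissingImageDimensions` is a hypothetical helper name, and the sketch assumes the images have already finished loading so that `naturalWidth` and `naturalHeight` are available.

```javascript
// Hypothetical helper: give every image that has neither a width nor a
// height attribute its intrinsic size. Images that already carry one of
// the two attributes are left alone so their aspect ratio is not distorted.
// Assumes the images have loaded (naturalWidth/naturalHeight > 0).
function addMissingImageDimensions(images) {
    var changed = 0;
    for (var i = 0; i < images.length; i++) {
        var img = images[i];
        var hasAny = img.hasAttribute('width') || img.hasAttribute('height');
        if (!hasAny && img.naturalWidth > 0 && img.naturalHeight > 0) {
            img.setAttribute('width', String(img.naturalWidth));
            img.setAttribute('height', String(img.naturalHeight));
            changed++;
        }
    }
    return changed; // number of images that were updated
}

// In a page: addMissingImageDimensions(document.getElementsByTagName('img'));
```

The call would have to run before the column count is computed, otherwise the layout may still shift.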

Hope it helps.

-Dan

Dan Watling
+1 Couldn't get this to work, but I think it has pointed me in the right direction.
hgpc
+2  A: 

Maybe it would work to use XSL-FO. This seems heavy for a mobile device, and maybe it's overkill, but it should work, and you wouldn't have to implement the complexities of good pagination (e.g. how do you make sure that each screen doesn't cut text in half) yourself.

The basic idea would be:

  • transform the XHTML (and other EPUB stuff) to XSL-FO using XSLT.
  • use an XSL-FO processor to render the XSL-FO into a paged format that you can display on the mobile device, such as PDF (can you display that?)

I don't know whether there is an XSL-FO processor available for Android. You could try Apache FOP. RenderX (XSL-FO processor) has the advantage of having a paged-HTML output option, but again I don't know if it could run on Android.

LarsH
+2  A: 

Building on Dan's answer, here is my solution to this problem, which I was struggling with myself until just now. (This JS works on iOS WebKit; no guarantees for Android, but please let me know the results.)

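// Size of one page in pixels. Assumption: the viewport size; set these to whatever page size you need.
var desiredHeight = window.innerHeight;
var desiredWidth = window.innerWidth;
var bodyID = document.getElementsByTagName('body')[0];
var totalHeight = bodyID.offsetHeight;
var pageCount = Math.floor(totalHeight / desiredHeight) + 1;
// CSS lengths need a unit; unitless numbers are ignored in standards mode.
bodyID.style.padding = '10px'; // (optional) prevents clipped letters around the edges
bodyID.style.width = (desiredWidth * pageCount) + 'px';
bodyID.style.height = desiredHeight + 'px';
bodyID.style.WebkitColumnCount = pageCount;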
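
One detail of the page-count arithmetic used above: `Math.floor(total / page) + 1` yields one page more than needed whenever the content height is an exact multiple of the page height. A minimal sketch of an alternative using `Math.ceil` follows; the function name `pageCountFor` is made up for illustration.

```javascript
// Hypothetical helper: number of pages needed for content of a given height.
function pageCountFor(totalHeight, pageHeight) {
    if (pageHeight <= 0) {
        throw new Error('pageHeight must be positive');
    }
    // Always report at least one page, even for empty content.
    return Math.max(1, Math.ceil(totalHeight / pageHeight));
}
```

For example, 1000px of content on 500px pages gives 2 pages here, whereas the floor-plus-one formula gives 3, the last of which would be blank.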

Hope this helps...

Engin Kurutepe
A: 

Not sure about device-specific solutions, but this one uses HTML, CSS, and JavaScript with jQuery. The gist is to copy the content from the original document, one subsection at a time, into a node. This node is enclosed in another node that acts as the boundary mask. After appending content to the inner node, check its height against the mask container. If the inner node's height exceeds the height of the mask, backtrack characters off the content until the inner node fits.

This is by no means an elegant solution, and it's painfully slow, but it works. http://www.devqty.mezoka.com/

I manually set the height and width of both the original doc node as well as the pages, but these could easily be converted to whatever sizes you want or dynamically generated based on screen sizes, etc.
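
The fill-and-backtrack idea can be sketched roughly as below. This is not the code behind the linked demo: `paginate` and `fits` are hypothetical names, and the sketch backtracks whole words rather than characters. In a browser, `fits` would write the text into the inner node and compare its height with the mask's height.

```javascript
// Splits `words` into pages. `fits(text)` must return true when `text`
// can be shown on one page without overflowing.
function paginate(words, fits) {
    var pages = [];
    var start = 0;
    while (start < words.length) {
        // Optimistically take everything that is left...
        var end = words.length;
        // ...then backtrack one word at a time until the page fits.
        // A page always keeps at least one word, even an oversized one,
        // so the loop is guaranteed to make progress.
        while (end > start + 1 && !fits(words.slice(start, end).join(' '))) {
            end--;
        }
        pages.push(words.slice(start, end).join(' '));
        start = end;
    }
    return pages;
}

// Possible browser-side predicate (assumes `inner` is the content node
// and `mask` is its clipping parent):
// function fits(text) {
//     inner.textContent = text;
//     return inner.offsetHeight <= mask.clientHeight;
// }
```

Backtracking from the full remainder is quadratic in the number of words, which matches the "painfully slow" caveat; starting from an estimate of the words per page would cut that down.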

Brian Flanagan
Wanting to use a fixed-page-size book metaphor in the browser is like someone in 1912 wanting his Oldsmobile to make the sound of a horse's hoofs. "People are used to riding in a buggy". :-)
Tim
Not really. I prefer one tap to change a screenful of text over scrolling, especially if the text is long.
hgpc
If you're still interested, it occurred to me this morning that I could optimize this logic by taking advantage of the assumption that the different pages would all be of a consistent height. I updated the code and significantly improved the speed.
Brian Flanagan
A: 

You can look at http://www.litres.ru/static/OR/or.html?data=/static/trials/00/42/47/00424722.gur.html&art=424722&user=0&trial=1 but the code may be heavily obfuscated, so just use Firebug to inspect DOM.

If the link isn't working, leave a comment and I'll post a fixed one.

mhambra
A: 

There are several ways this could be done. If every line is in its own element, all you have to do is check whether one of its edges goes outside the view (either the browser's, or the "book page").

If you want to know in advance how many "pages" there are going to be, just temporarily move the lines into the view and find the line on which each page ends. This could be slow, because the browser has to reflow the page before it knows where anything is.
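
Assuming each line's top and bottom positions are already known (for example from `getBoundingClientRect()` on per-line elements), grouping lines into pages so that none is cut off might look like this sketch. `groupLinesIntoPages` is a hypothetical name, and positions are taken to be in pixels from the top of the document.

```javascript
// `lines` is an array of {top, bottom} objects in document order.
// Returns an array of pages, each an array of line indices.
function groupLinesIntoPages(lines, pageHeight) {
    var pages = [];
    var current = [];
    var pageTop = 0;
    for (var i = 0; i < lines.length; i++) {
        if (current.length === 0) {
            pageTop = lines[i].top; // a new page starts at its first line
        } else if (lines[i].bottom - pageTop > pageHeight) {
            // This line's bottom edge would fall outside the page:
            // close the page and start the next one with this line.
            pages.push(current);
            current = [];
            pageTop = lines[i].top;
        }
        current.push(i);
    }
    if (current.length > 0) {
        pages.push(current);
    }
    return pages;
}
```

The length of the returned array is the page count, and the first index in each entry tells you which line to scroll to.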

Otherwise I think that you could use the HTML5 canvas element to measure text and / or draw text.

Some info on that here: https://developer.mozilla.org/en/Drawing_text_using_a_canvas http://uupaa-js-spinoff.googlecode.com/svn/trunk/uupaa-excanvas.js/demo/8_2_canvas_measureText.html
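
Measuring text in order to break it into lines might be sketched as follows. The `measure` argument is an assumption of this example: in a browser it could be `function (s) { return ctx.measureText(s).width; }` for a 2D canvas context `ctx` whose `font` matches the page. `wrapLines` is a hypothetical name.

```javascript
// Greedy word wrap: returns the lines that `text` breaks into when no line
// may be wider than `maxWidth`, as reported by `measure(string)`.
// A single word wider than maxWidth is kept on a line of its own.
function wrapLines(text, maxWidth, measure) {
    var words = text.split(/\s+/).filter(function (w) { return w.length > 0; });
    var lines = [];
    var line = '';
    for (var i = 0; i < words.length; i++) {
        var candidate = line === '' ? words[i] : line + ' ' + words[i];
        if (line !== '' && measure(candidate) > maxWidth) {
            lines.push(line);  // the current line is full
            line = words[i];   // start a new line with this word
        } else {
            line = candidate;
        }
    }
    if (line !== '') {
        lines.push(line);
    }
    return lines;
}
```

With a fixed line height, the number of lines per page is then the page height divided by the line height, rounded down.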

Frank