I am doing pagination in JavaScript. This is typographic pagination, not chopping up database results. For the most part it works, but I have run into a Heisenberg-style problem: I cannot quite measure the text without affecting it.

I am not trying to measure text before it is rendered. I want the actual position it shows up at on screen, so I can paginate to where it is naturally wrapped. I am measuring the vertical position of characters, not the horizontal width of strings. The way I do this is similar to this answer in that I am applying a style to a block of text, then measuring the position of the newly created span. If the span does not reach the end of the page, I clear it and make a new span in a linear search.
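To make the search concrete, here is a minimal sketch of the linear scan described above. It is not the actual code from the app. `measureBottom` is a hypothetical callback standing in for "wrap character i in a temporary span, read the span's bottom edge, remove the span", and `fakeMeasureBottom` is an invented stand-in for the DOM so the loop can run on its own.

```javascript
// measureBottom(i) is a hypothetical callback: it wraps character i in a
// temporary span, returns the span's bottom edge in pixels, then removes it.
function findFirstOffPageIndex(length, measureBottom, pageBottom) {
  for (var i = 0; i < length; i++) {
    if (measureBottom(i) > pageBottom) {
      return i; // first character that falls below the page
    }
  }
  return -1; // everything fits on the current page
}

// Invented stand-in for the DOM: 10 characters per line, 20px line height.
function fakeMeasureBottom(i) {
  return (Math.floor(i / 10) + 1) * 20;
}

console.log(findFirstOffPageIndex(35, fakeMeasureBottom, 60)); // 30
```

With a real DOM the callback is where the trouble starts, because inserting the span can itself change the wrapping that is being measured.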

The problem is that the anti-aliased sub-pixel text layout is different when the span is applied. In rare cases, this causes the text to wrap differently when I measure it. I have only seen this when wrapping at a hyphen, and I assume it would not happen when wrapping at white space.

As a concrete example, "prepared-he" is the string I am having trouble with. When I measure up to "prepare" it appears, as expected, to be within the current page. When I measure "prepared" the whole phrase wraps down to the next line, moving it to the next page, so it looks like the "d" is the character to break at. I break the text between "prepare" and "d-he" and that is wrong. Trying to evaluate individual characters opens a whole can of worms I would rather avoid. The wrapping changes because, with the new span, the line is 1 pixel wider.

A solution to my problem could be either a better way to measure text using JavaScript, or a way to wrap text in a new element without affecting layout.

I have tried setting margin-right:-1px for the class of the span being created to wrap the text. This had no noticeable effect.

I am doing this in a UIWebView on the iPhone. Some measurement-related calls that are available in normal WebKit are not available here. For example, Range does not have getBoundingClientRect, and it does not support setting an offset other than 0 in setStart or setEnd.

Thank you

Edit:

I tried the suggestion from unomi of making the span dimensionless. It had no effect. The class I am giving the span has no rules and is just used for quick deletion.

I tried iterating backwards instead of forwards through the text. The wrapping errors showed up in different places but the overall problem remained.

The text is mostly paragraphs with some simple styling. I do not think the method I am using would work with tables or lists. Within the paragraphs, I apply the span to one character at a time.

I tried reducing the font size for the span. The wrapping rules seem to allow wrapping at a span even if it is within a word, so that replaces one set of errors with another.

+1  A: 

This is a bit of a cop-out, but do you really want to chop up a paragraph?
Wouldn't it improve readability to simply break at the first <p> that wanders off viewport?

Ok, so just to be clear, it sounds like you are testing the position of a span which is moved character by character through a text? If that is correct, and the issue you have is with breaking up words, why don't you simply jump from white space to white space (optionally including hyphens) rather than from character to character?

Keep 1 previous location and break at it when the current one is off viewport.
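A rough sketch of that idea, under some assumptions: `isOnPage` is a hypothetical callback that reports whether the character at a given index is rendered on the current page, and the candidate set (anything matched by `\s`, plus the ASCII hyphen) is a simplification of the browser's real line-breaking rules.

```javascript
// Offsets just after each whitespace character or hyphen: places where a
// page break would not split a word. The candidate set is an assumption.
function breakCandidates(text) {
  var offsets = [];
  for (var i = 0; i < text.length; i++) {
    if (/[\s-]/.test(text.charAt(i))) {
      offsets.push(i + 1);
    }
  }
  // The end of the text is always a candidate (avoid adding it twice).
  if (text.length > 0 && offsets[offsets.length - 1] !== text.length) {
    offsets.push(text.length);
  }
  return offsets;
}

// isOnPage(index) is a hypothetical callback: true when the character at
// `index` is rendered on the current page.
function findPageBreak(text, isOnPage) {
  var candidates = breakCandidates(text);
  var previous = 0; // last candidate known to be on the page
  for (var k = 0; k < candidates.length; k++) {
    // Test the last character before this candidate offset.
    if (!isOnPage(candidates[k] - 1)) {
      return previous;
    }
    previous = candidates[k];
  }
  return text.length; // the whole text fits
}
```

For example, with the text "he was prepared-he said" and a page that ends after index 11, the function returns 7, so "prepared-he said" moves to the next page as a unit.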

I guess before too much else is done, are we sure that we can't make that span truly dimensionless?

span.marker {
  border: 0px; padding: 0px; margin: 0px;
  width: 0px; overflow: hidden; height: 0px;
}
unomi
Or the first `<br>`. It's not clear what formatting the text has.
Brock Adams
That is the method we are replacing. I thought it was fine, but it is not up to me. The text is mostly paragraphs.
drawnonward
@drawnonward: Are there no more serious issues to do first? ;-) If you know: (1) the total offset of the last on-screen element; (2) likewise for the first off-screen element; (3) font-size, line-height, padding and margin... then you ought to be able to guess close enough. Start from a position known to be off-screen and refine one word at a time.
Brock Adams
Or maybe... In the last onscreen paragraph, wrap every last mother-love'n stretch of whitespace in a span and refine using those offsets.
Brock Adams
@Brock: the problem with basing it on the text is that whitespace is not the only legal character to wrap on, and I have to support at least some Unicode. I will try searching backwards instead of forwards.
drawnonward
@drawnonward, You don't want to page split at a hyphen, that's just rude. ;-) So, besides whitespace and -- MAYBE -- terminating punctuation (all of which can be spanned too), what else would you wrap on?
Brock Adams
I have to agree, this sounds like one of those things that, even if you get it right, will still turn out to be the wrong user experience.
unomi
A: 

I did not find an ideal solution. This is the solution I came up with.

I apply the measuring span to one character at a time. I found two cases where there were problems. Sometimes a word would end up being longer, and the word would wrap to the next line when being measured. Sometimes a word with a hyphen or similar character would split differently when being measured.

For the case of a whole word wrapping differently, I change the class of the measuring span to have a smaller font size. If the same character does not wrap to the next line when using a smaller font size, I ignore the measurement as invalid and continue searching.

For the case of a split word wrapping differently, I measure the same character with the previous character. If the span is not wrapped with two characters, then I assume the next character will wrap and break there.
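As an illustration only, the first check (confirming with the smaller font) might look something like the sketch below. Both callbacks are hypothetical stand-ins for the DOM measurement, and the two-character check for split words is left out because its exact rule depends on how the description above is read.

```javascript
// Both callbacks are hypothetical. Each wraps character i in a measuring span
// (with the normal class or the smaller-font class) and reports whether that
// span lands past the end of the page.
function findConfirmedOverflow(length, overflowsNormal, overflowsSmall) {
  for (var i = 0; i < length; i++) {
    if (overflowsNormal(i) && overflowsSmall(i)) {
      return i; // both measurements agree: this character is really off the page
    }
    // Otherwise the character fits, or the overflow was caused by the
    // measuring span itself, so the measurement is ignored.
  }
  return -1; // nothing overflows
}
```

A measurement that only overflows at the normal size is treated as an artifact of measuring, and the search continues.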

These problems seem to arise because formatting changes the kerning between characters. When I am using a span to measure the position of a character, it is at a slightly different position because it starts on a pixel boundary and ignores the kerning to the previous character.

I did not try spanning the entire block of text instead of a single character. It would add some complexity and I suspect the same problems would crop up in a slightly different way.

drawnonward
So you are page wrapping in the middle of a word? E<page break>ven **if** i<page break>t is hyphenated cor-<page break>rectly, that would be most user-hostile. I'<page break>d be most annoyed. ;)
Brock Adams
No, the goal is to match the natural word wrapping used by the web browser. The problem is that using a span to measure the text changes the word size and wrapping rules, so it may break in the middle of a word while measuring. The best solution would be a way to measure text that does not change the wrapping rules or break words in the middle. The above solution works around the fact that measuring changes what is measured.
drawnonward