Does anyone have good suggestions for displaying "max length exceeded" errors to a user when a single character doesn't equal one byte?

I'm at a loss for words, but I found a quote that's more eloquent:

If the buffer runs over by three bytes, what do you tell the user? Three bytes could be one, two, or three characters that the user needs to trim. Depending on which characters they trim, the result might still be too long. And recall that the user's perception of "a character" is probably closer to a grapheme or grapheme cluster than to a character. So they might delete too many characters without realizing it. Finally, if the buffer limit is small (like 10 or 20), some languages like Chinese will be severely restricted on the number of characters permitted.

A couple of constraints I'm under: it's a form-driven website, and the underlying database column sizes can't change (the page I quoted suggests having a 40-byte buffer and enforcing a 10-character limit).
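To make the mismatch in the quote concrete, here is a minimal sketch that measures one string three ways. The function name `measure` is made up for illustration, and it assumes an environment with `TextEncoder` and `Intl.Segmenter` (modern browsers, recent Node.js). Since UTF-8 never needs more than 4 bytes per code point, it also shows why a 40-byte buffer can back a 10-character limit, as long as "character" means code point.

```javascript
// Count one string three ways: UTF-8 bytes, Unicode code points, and
// user-perceived characters (grapheme clusters).
function measure(text) {
    const bytes = new TextEncoder().encode(text).length;
    // Array.from iterates by code point, not by UTF-16 code unit.
    const codePoints = Array.from(text).length;
    // Assumption: Intl.Segmenter is available in the target environment.
    const segmenter = new Intl.Segmenter(undefined, { granularity: 'grapheme' });
    const graphemes = Array.from(segmenter.segment(text)).length;
    return { bytes, codePoints, graphemes };
}

console.log(measure('abc'));      // { bytes: 3, codePoints: 3, graphemes: 3 }
console.log(measure('中文'));      // { bytes: 6, codePoints: 2, graphemes: 2 }
console.log(measure('e\u0301'));  // { bytes: 3, codePoints: 2, graphemes: 1 }
```

The last case is the awkward one: the user sees a single accented letter, but deleting it frees three bytes.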

+2  A: 

The obvious correct answer is to not limit the text length.

But if you can't tell the user how many characters they have to play with, don't. Simply tell them when the string is too long. Keep track of how many bytes the current string would require, and if that is above your limit, enable a warning message for the user.
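A rough sketch of that idea, assuming the column stores UTF-8 and that `TextEncoder` is available. The names `exceedsByteLimit` and `watchField` are hypothetical, and `warning` stands for whatever element holds the message:

```javascript
// Hypothetical helper: true when the value needs more UTF-8 bytes
// than the column can hold.
function exceedsByteLimit(value, maxBytes) {
    return new TextEncoder().encode(value).length > maxBytes;
}

// Hypothetical wiring: show or hide a warning element as the user types.
// No character count is shown, only whether the text is too long.
function watchField(input, warning, maxBytes) {
    function update() {
        warning.hidden = !exceedsByteLimit(input.value, maxBytes);
    }
    input.addEventListener('input', update);
    update(); // set the initial state for pre-filled fields
}
```

The server still has to repeat the same check, since the script can be bypassed.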

jalf
+4  A: 

My favorite way to solve this problem is to highlight the portion of the input that exceeds the maximum length. This provides a visual cue as to which part makes it "too long", without having to get into the specifics of how many bytes or characters it was.

If you can use JavaScript (for example, if you don't need to meet Section 508 standards), I also like monitoring the length of the field and alerting the user when it's too long (while still doing server-side validation, of course).

If you don't want to get into complex CSS inside the input field, you can just reproduce the bad input below the field and highlight it there.
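Either way, the highlighting needs to know where the input stops fitting. One possible sketch, assuming a UTF-8 byte limit; `splitAtByteLimit` is an illustrative name, and the cut is made between code points, so it can still separate a base letter from a combining accent:

```javascript
// Split text into the part that fits in maxBytes of UTF-8 and the overflow.
// The cut always falls between code points, never inside one.
function splitAtByteLimit(text, maxBytes) {
    const encoder = new TextEncoder();
    let used = 0;
    let fits = '';
    for (const ch of text) { // for...of iterates by code point
        const size = encoder.encode(ch).length;
        if (used + size > maxBytes) break;
        used += size;
        fits += ch;
    }
    // fits is an exact prefix of text, so its length marks the cut.
    return { fits: fits, overflow: text.slice(fits.length) };
}

console.log(splitAtByteLimit('日本語テキスト', 10));
// { fits: '日本語', overflow: 'テキスト' }
```

The `overflow` part is what would be highlighted, either in place or in the copy shown below the field.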

John Feminella
A: 

Just thinking out loud: why not be less specific? Say "maximum length exceeded by N" (e.g. "maximum length exceeded by 4"). You don't tell the user what the max length is, just that they are N over it. And you don't tell the user what N represents (bytes). When they see the message "too long by 3", they will remove at least 3 characters (even though they could be removing 9 actual bytes).

I figure there's just no way to explain to users why certain "characters" require multiple bytes without a high probability of confusing them.
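As a sketch of that message, assuming UTF-8 storage (the function name `overLimitMessage` is made up): N is the byte overrun, but the text never says so. Because every character takes at least one byte, a user who removes N characters always ends up within the limit.

```javascript
// Returns null when the value fits, otherwise a deliberately unit-free
// message. The number is the overrun in UTF-8 bytes.
function overLimitMessage(value, maxBytes) {
    const over = new TextEncoder().encode(value).length - maxBytes;
    return over > 0 ? 'Maximum length exceeded by ' + over : null;
}

console.log(overLimitMessage('abcdefg', 4)); // "Maximum length exceeded by 3"
console.log(overLimitMessage('abc', 4));     // null
```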

A: 

Good question. Not sure there's a good answer, other than changing the schema to count Unicode characters instead of bytes. For example, in SQL Server with NVARCHAR, or in MySQL with a UTF-8 character set, column limits are expressed in characters rather than bytes (strictly, NVARCHAR counts UTF-16 code units, which only differs for characters outside the Basic Multilingual Plane). That's pushing it a bit regarding “column lengths can't change”, of course, even if they're technically the same ‘length’.

For what it's worth, East Asian users will be used to the idea that a character isn't a byte, because there's a long-standing tradition of ‘half-width’ Latin characters taking up half as much storage and screen space as the Chinese ideographs.

You can't really expect users in general to grok UTF-8 byte counts, though. Perhaps at the client side you could do it purely visually, using an ‘amount used’ bar instead of a number of bytes:

<style type="text/css">
    .field { width: 12em; }
    .field input { width: 100%; }
    .field input { box-sizing: border-box; -moz-box-sizing: border-box; -ms-box-sizing: border-box; -webkit-box-sizing: border-box; -khtml-box-sizing: border-box; }
    .indicator { background: blue; height: 5px; }
    .indicator-over { background: red; height: 5px; }
</style>

<div class="field">
    <input type="text" name="pwd" class="limited-12">
</div>

<script type="text/javascript">
    // Add a usage bar under `element` that tracks how much of the
    // `limit`-byte budget its value consumes once encoded as UTF-8.
    function limitInput(element, limit) {
        var indicator= document.createElement('div');
        element.parentNode.insertBefore(indicator, element.nextSibling);
        element.onchange=element.onkeyup= function() {
            // encodeURIComponent percent-encodes the UTF-8 bytes and unescape
            // turns each %XX back into one char, so length == byte count.
            var utf8= unescape(encodeURIComponent(element.value));
            indicator.className= utf8.length>limit? 'indicator-over' : 'indicator';
            var used= Math.min(utf8.length/limit, 1);
            indicator.style.width= Math.floor(used*100)+'%';
        };
        element.onchange();
    }

    // Apply to every input whose class starts with 'limited-N' (N = byte limit).
    var inputs= document.getElementsByTagName('input');
    for (var i= inputs.length; i-->0;)
        if (inputs[i].className.substring(0, 8)=='limited-')
            limitInput(inputs[i], parseInt(inputs[i].className.substring(8), 10));
</script>
bobince