views:

96

answers:

3

I was reading this paper "Ropes: an Alternative to Strings" about ropes

alt text

[figure from the same paper]

and I was wondering if that is the data structure used by today's browsers to implement textboxes or not. Do we use ropes or some other data structures for this?

Are ropes used somewhere besides textboxes?


The previous title of my question somehow also meant that I wanted to know how the string "remembering" happens - as I type, I get suggestions. I have changed it now.

What I want to know is what data structure is used to store the string when I type it. Is it something simple like a char array or something complex like a rope?

+1  A: 

Most probably just use whatever text box is provided by the underlying OS/windowing system. I'd guess in at least most cases, that'll be a simple linear array for a text box -- most rarely hold anywhere close to the amount of data necessary for anything like a rope to really make sense.

Jerry Coffin
@Jerry Coffin So, the difference that we see, for example in a Firefox textbox vs a Google Chrome textbox is only in the way it is rendered?
Lazer
@Lazer: Well, I suppose it depends on what textbox you're talking about. I was thinking mostly of those built into the browser (e.g., for picking your home page). For something like a text field in an HTML form, they undoubtedly provide their own (e.g., both have spell checkers that OS-provided ones normally lack). In any of these cases, however, I'd guess the text is stored in a simple array.
Jerry Coffin
A: 

The problem (in a simple case) is finding all the strings containing some substring. Since it's not always searching on common works or even letters, my guess would be some kind of http://en.wikipedia.org/wiki/N-gram index. For example, for trigrams:

  1. For each string to index, find all (overalapping) subsequences of 3 characters (trigrams).
  2. For each subsequence, store a list of all strings it appears in. This is the index, and it's a map from trigram -> list of strings.
  3. If the user types in a keyword, find its trigrams, look them up in the index, and return the intersection of the corresponding lists of strings.

This is a quick way of returning the strings that may contain the word. For more precision the results can be filtered down to those that contain the whole substring.

Browsers can enhance this in various ways, for example if several words are typed they can do the search for each word and return the URLs that contain either.

Edmund
@Edmund: thats not what I meant. Please see the updated question.
Lazer
A: 

They use the prefix matching algorithm. A Trie (and advanced versions thereof) are the best ways to implement longest prefix matches.

Chrome Source : http://www.google.com/codesearch/p?hl=en#WT2nGdYBQUk/branches/chrome/chrome/src/cpp/include/chrome/browser/autocomplete/autocomplete.h&q=chromium%20lang:c++%20textbox&sa=N&cd=4&ct=rc&d=8

NB: I presumed you meant how they "remember" the text as you type.

If you mean how each text box contains a list of things you had typed earlier and shows it in a pop down - a list that gets appended to each time you "submit" that text is used.

viksit
@viksit: thats not what I meant. Please see the updated question.
Lazer