views:

56

answers:

3

All web developers run into this problem when the amount of data in their project grows, and I have yet to see a definitive, intuitive best practice for solving it. When you start a project, you often create forms with tags to help pick related objects for one-to-many relationships.

For instance, I might have a system with Neighbors and each Neighbor belongs to a Neighborhood. In version 1 of the application I create an edit user form that has a drop down for selecting users, that simply lists the 5 possible neighborhoods in my geographically limited application.

In the beginning, this works great. So long as I have maybe 100 records or less, my select box will load quickly, and be fairly easy to use. However, lets say my application takes off and goes national. Instead of 5 neighborhoods I have 10,000. Suddenly my little drop-down takes forever to load, and once it loads, its hard to find your neighborhood in the massive alphabetically sorted list.

Now, in this particular situation, having hierarchical data, and letting users drill down using several dynamically generated drop downs would probably work okay. However, what is the best solution when the objects/records being selected are not hierarchical in nature? In the past, of done this with a popup with a search box, and a list, but this seems clunky and dated. In today's web 2.0 world, what is a good way to find one object amongst many for ones forms?

I've considered using an Ajaxifed search box, but this seems to work best for free text, and falls apart a little when the data to be saved is just a reference to another object or record.

Feel free to cite specific libraries with generic solutions to this problem, or simply share what you have done in your projects in a more general way

+2  A: 

I think an auto-completing text box is a good approach in this case. Here on SO, they also use an auto-completing box for tags where the entry already needs to exist, i.e. not free-text but a selection. (remember that creating new tags requires reputation!)

I personally prefer this anyways, because I can type faster than select something with the mouse, but that is programmer's disease I guess :)

mnemosyn
A: 

Auto-complete is usually the best solution in my experience for searches, but only where the user is able to provide text tokens easily, either as part of the object name or taxonomy that contains the object (such as a product category, or postcode).

However this doesn't always work, particularly where 'browse' behavior would be more suitable - to give a real example, I once wrote a page for a community site that allowed a user to send a message to their friends. We used auto-complete there, allowing multiple entries separated by commas.

It works great when you know the names of the people you want to send the message to, but we found during user acceptance that most people didn't really know who was on their friend list and couldn't use the page very well - so we added a list popup with friend icons, and that was more successful.

(this was quite some time ago - everyone just copies Facebook now...)

Sam
A: 

Different methods of organizing large amounts of data:

  • Hierarchies
  • Spatial (geography/geometry)
  • Tags or facets

Different methods of searching large amounts of data:

  • Filtering (including autocomplete)
  • Sorting/paging (alphabetically-sorted data can also be paged by first letter)
  • Drill-down (assuming the data is organized as above)
  • Free-text search

Hierarchies are easy to understand and (usually) easy to implement. However, they can be difficult to navigate and lead to ambiguities. Spatial visualization is by far the best option if your data is actually spatial or can be represented that way; unfortunately this applies to less than 1% of the data we normally deal with day-to-day. Tags are great, but - as we see here on SO - can often be misused, misunderstood, or otherwise rendered less effective than expected.

If it's possible for you to reorganize your data in some relatively natural way, then that should always be the first step. Whatever best communicates the natural ordering is usually the best answer.

No matter how you organize the data, you'll eventually need to start providing search capabilities, and unlike organization of data, search methods tend to be orthogonal - you can implement more than one. Filtering and sorting/paging are the easiest, and if an autocomplete textbox or paged list (grid) can achieve the desired result, go for that. If you need to provide the ability to search truly massive amounts of data with no coherent organization, then you'll need to provide a full textual search.

If I could point you to some list of "best practices", I would, but HID is rarely so clear-cut. Use the aforementioned options as a starting point and see where that takes you.

Aaronaught