I am using an array that holds the results of a database query, which is later formatted as HTML (for a web application) or as CSV (for import into a spreadsheet). I want to attach additional information to each array element describing how its data can be used.

For instance, array-element-data...

  • ... can be displayed as a link: then I want the link information attached. The code which creates html from the array can use it to create a link.
  • ... is a date of the form 2009-09-14: then I want to somehow flag it as being a date. If the output is an HTML page, it can then be displayed more readably, e.g. Mo Sep 14 or Today; if the recipient is a CSV file, it would be best to leave it as is.

Is there a best-practice way of solving this problem?

I did think of several possible solutions, but wanted to ask if someone already knows a "best practice". From possibly best to worst:

  1. Store each array-element as custom-created object (Date,Linkable,Text...), instead of having the array-element as text. Possibly provide a default .to_string() method.
  2. Make each array-element a hash, so that I could say a[5][7]['text'] or a[5][7]['link'].
  3. Make different versions of the array, e.g. textArray[5][7], linkArray[5][7]

Creating the html as a start and just using the text-version seems like a bad idea, as the appearance differs depending on the usage (e.g. 2009-09-14 vs Mo Sep 14).

Or am I just asking the wrong question?

A: 

Unless you specify a language, (1) and (2) are basically the same. An object or a hash, what's the difference in a dynamic programming language other than perhaps syntax? In Lua everything's a dictionary.

(1)/(2) are usually preferred to (3), since they generally make copying an element along with its meta-data much easier. When sorting, for instance.

So without being specific to a language / environment, best practice in the absence of any special conditions is to somehow combine the meta-data and the element, and deal in the resulting datatype. You could do this by defining a new class to contain both, defining one or more subclasses of your original element type, using a generic pair, using a general-purpose dictionary, or just storing the meta-data in the original object (which would be the obvious approach in, say, JavaScript).
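As a rough illustration of why bundling helps, here is a minimal Ruby sketch. The `Cell` struct, its field names, and the sample rows are made up for this example and are not from the question; the point is only that sorting moves the value and its meta-data together.

```ruby
require 'date'

# Hypothetical cell type: the value and its meta-data live in one object.
Cell = Struct.new(:value, :kind, :link)

rows = [
  [Cell.new('Smith', :text, 'http://example.com/smith'),
   Cell.new(Date.new(2009, 9, 14), :date, nil)],
  [Cell.new('Adams', :text, nil),
   Cell.new(Date.new(2009, 9, 1), :date, nil)]
]

# Sorting reorders whole cells, so each link stays with its own text.
sorted = rows.sort_by { |row| row[0].value }
```

With parallel arrays (option 3), the same sort would have to be applied to every array in step, which is where the two tend to drift apart.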

Steve Jessop
Thanks for the concise answer - although all of the answers did make some new point clear to me!
thomastiger
A: 

As general advice, it's best if the data contains no information at all on how to represent itself.

Instead, the part of the application creating the representation should have these settings in a separate data structure. Think of it as an XML file and various XSLT files creating different representations.
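One way this separation might look in Ruby is sketched below. The names `ROWS`, `HTML_RULES`, `CSV_RULES`, and `render` are assumptions invented for the illustration: the rows carry plain values only, and each output format keeps its own table of display rules.

```ruby
require 'date'
require 'erb'

# Plain data: nothing here says how a field should be displayed.
ROWS = [
  { name: 'Smith', url: 'http://example.com/smith', joined: Date.new(2009, 9, 14) }
].freeze

# Hypothetical rule tables, one per output format (insertion order = column order).
HTML_RULES = {
  name: lambda do |row|
    href = ERB::Util.html_escape(row[:url])
    text = ERB::Util.html_escape(row[:name])
    %(<a href="#{href}">#{text}</a>)
  end,
  joined: ->(row) { row[:joined].strftime('%a %b %d') }
}.freeze

CSV_RULES = {
  name:   ->(row) { row[:name] },
  joined: ->(row) { row[:joined].iso8601 }
}.freeze

# Applies every rule of the chosen format to every row.
def render(rows, rules)
  rows.map { |row| rules.values.map { |rule| rule.call(row) } }
end

html_cells = render(ROWS, HTML_RULES)
csv_cells  = render(ROWS, CSV_RULES)
```

Adding a third output format then means adding another rule table, without touching the data.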

But in cases where this is not possible, or when you have to merge the two kinds of information into one data structure for the actual conversion, I follow this rule of thumb:

Don't be too clever and do what's most natural in your language!

  • In Java and Delphi I always used the "custom object" variant, because one gets certain advantages like compile time checking.
  • In PHP I always used hashes because it's the more PHP-ish style.

I've done "Solution 3" sometimes, but I always regretted it. These structures tend to become a maintenance nightmare and you'll most likely run into synchronization issues, from a data point of view and also from a coding point of view.

DR
"it's best if the data contains no information at all on how to represent itself" - I think a URL's OK, though. I do think that to_string starts to test the limits of reasonable responsibility for the object, if the string is going to be displayed to users. If nothing else, when you realise that rendering dates is language/locale-dependent, you're going to introduce dependencies which could be separated in some kind of formatter class.
Steve Jessop
I'd say that highly depends on the application's domain. In your and the OP's example, I'd also agree that it's OK.
DR
A: 

A common approach in web frameworks would be to map records onto objects: one record from the database is read into one object, so your result is an Array of Objects. For different tables you need different classes. This is one of the building blocks for the Model View Controller (MVC) pattern used in many web frameworks.

For example in Ruby on Rails the Table users is handled by a Class User. You create both using a scaffold.

ruby script\generate scaffold user lastname:string link:string joined:date

Date, Boolean, String, Text, Decimal, Integer are distinct datatypes here. Unfortunately URLs are not, so I have to use String for the link.

You can read users from the database like this:

@u = User.find(77)       # gives you one object
@list = User.find(:all)  # gives you an array of User-objects

The attributes of the user object have the correct types to work with dates, numbers, etc.:

if 100.days.ago < @u.joined then ....

Logic that is inherent to the data is implemented in the User class.

The list of users may be displayed in HTML using a view like this:

  <h1>Listing Users</h1>
  <table>
    <tr>
      <th>Lastname</th>
      <th>Link</th>
      <th>Joined on</th>
    </tr>
  <% @list.each do |user| %>
    <tr>
      <td><%=h user.lastname %></td>
      <td><%= link_to "Homepage", user.link %></td>
      <td><%=h user.joined %></td>
    </tr>
  <% end %>
  </table>

Logic that is inherent to displaying the data is implemented in the view(s). The knowledge of which attribute of the object is to be treated as a link and which as normal text resides in the view, not in the object itself.

Displaying / outputting the same data as CSV is done by creating a CSV view.
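A CSV view could be sketched in plain Ruby roughly as follows. This is not Rails view code: the `User` struct is a stand-in for the model above, `users_to_csv` is a hypothetical helper, and the `csv` library is assumed to be available (it ships with standard Ruby installs, as a bundled gem in recent versions).

```ruby
require 'csv'
require 'date'

# Hypothetical stand-in for the Rails User model.
User = Struct.new(:lastname, :link, :joined)

# Builds the CSV text; the link is written as plain text and the
# date stays in ISO form so a spreadsheet can parse it.
def users_to_csv(users)
  CSV.generate do |csv|
    csv << %w[lastname link joined]
    users.each do |user|
      csv << [user.lastname, user.link, user.joined.iso8601]
    end
  end
end

users = [User.new('Smith', 'http://example.com/smith', Date.new(2009, 9, 14))]
output = users_to_csv(users)
```

The same `User` objects feed both views; only the view decides that the link becomes an anchor in HTML and a bare URL in CSV.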

bjelli