I'm trying to write an embedded (NOT web, not enterprise) content management system in Java, with a focus on organization and ease of use and scalability to 100,000 or so items. The user & system should be able to create and define metadata items which can be associated with unique resources, to allow for searching.

For example, they can create a tag "ProjectName" which takes String values. Then they can tag a bunch of resources as belonging to projects "Take Over the World" or "Fix My Car." The tags are strongly typed, so a tag may store single or multiple string(s), integer(s), double(s), etc. Each tag type should have formatters and input validators to allow editing.

I've decided that it is important to abstract the storage model from the GUI, to allow for scalability; the obvious way to do this is to use data access objects (DAOs) for each resource. However, I can't figure out how to write DAOs that support a variable number of tags and will scale properly.

The problem is that resources need to behave both as tuples (for tabular viewing/sorting/filtering) and as (TagName,TagValue) maps. The GUI models may call these methods potentially thousands of times for each GUI update, so some notion of indexing would make it all work better. Unfortunately, the multiple tag types mean it'll be awkward unless I return everything as a generic Object and do a whole mess of "TagValue instanceof Type" conditionals.

I've looked into using reflection and Apache's DynaBeans, but coding this to work with GUI models looks painful and awkward.

So, my question is: is there a better way? Some library or design pattern that would simplify this whole thing?

A: 

Are you tied to using a relational database? It might be worthwhile to look into a document-oriented database such as CouchDB. It will give you the flexibility you need to store any arbitrary strongly typed object and the ability to query those objects as well. I believe there are some Java libraries for accessing CouchDB too.

Andrew
I don't want to be strongly tied to any particular system, but I'm already considering using Apache Jackrabbit for a backing store. I mostly need a good DAO model that expresses a good generic interface.
BobMcGee
I am not familiar with Jackrabbit, but it looks like it implements the Java Content Repository (JCR) spec. That spec should give you an API for generic Nodes, Properties, etc. As for a DAO, it looks like Spring has some pretty good support for JCR and for implementing a DAO using their framework: https://springmodules.dev.java.net/docs/reference/0.6/html/jcr.html
Andrew
A: 

I assume from your question that a "resource" is an entity in your system that has some "tag" entities associated with it. If my assumption is correct, here's a vanilla DAO interface (let me know if this is what you're thinking):

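import java.util.List;
import java.util.Set;

// Resource, Tag and QueryCriteria are application-defined types.
public interface ResourceDAO {
    void store(Resource resource);
    void remove(Resource resource);
    List<Resource> findResources(QueryCriteria criteria);
    void addTagsToResource(Resource resource, Set<Tag> tags);
}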

The idea here is that you would implement this interface for whatever data storage mechanism you have available, and the application would access it via this interface. Instances of implementation classes would be obtained from a factory.
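To make the factory idea concrete, here is a minimal sketch with an in-memory implementation. Everything beyond the ResourceDAO interface is an assumption for illustration: the tiny `Resource`, `Tag` and `QueryCriteria` types, `InMemoryResourceDAO`, and `ResourceDAOFactory` are hypothetical names, and a real backing store (Jackrabbit, a database) would need its own implementation class.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical minimal domain types, just enough to compile the sketch.
class Tag {
    final String name;
    final Object value;
    Tag(String name, Object value) { this.name = name; this.value = value; }
}

class Resource {
    final String id;
    final Set<Tag> tags = new HashSet<>();
    Resource(String id) { this.id = id; }
}

// Assumed shape: a criteria object is a yes/no test on a resource.
interface QueryCriteria {
    boolean matches(Resource resource);
}

interface ResourceDAO {
    void store(Resource resource);
    void remove(Resource resource);
    List<Resource> findResources(QueryCriteria criteria);
    void addTagsToResource(Resource resource, Set<Tag> tags);
}

// One possible implementation; other storage mechanisms implement the same interface.
class InMemoryResourceDAO implements ResourceDAO {
    private final List<Resource> resources = new ArrayList<>();

    public void store(Resource resource) {
        if (!resources.contains(resource)) resources.add(resource);
    }

    public void remove(Resource resource) { resources.remove(resource); }

    public List<Resource> findResources(QueryCriteria criteria) {
        List<Resource> found = new ArrayList<>();
        for (Resource r : resources) {
            if (criteria.matches(r)) found.add(r);
        }
        return found;
    }

    public void addTagsToResource(Resource resource, Set<Tag> tags) {
        resource.tags.addAll(tags);
    }
}

// The application asks the factory for a DAO and never names the concrete class.
class ResourceDAOFactory {
    static ResourceDAO create() { return new InMemoryResourceDAO(); }
}
```

Swapping the storage mechanism then only means changing what `ResourceDAOFactory.create()` returns.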

Does this fit with what you're thinking?

The other aspect of the problem you mention is having to contend with multiple different TagTypes that require different behavior depending on the type (requiring "TagValue instanceof Type" conditionals). The Visitor pattern may handle this for you in an elegant way.
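A minimal sketch of how the Visitor pattern could replace the instanceof chain. The type names (`TagValue`, `StringValue`, `IntegerValue`, `DoubleValue`, `DisplayFormatter`) are invented for illustration and are not from any library; the two-decimal format for doubles is an arbitrary choice.

```java
import java.util.Locale;

// One visit method per tag value type; R is whatever the operation returns.
interface TagValueVisitor<R> {
    R visitString(StringValue value);
    R visitInteger(IntegerValue value);
    R visitDouble(DoubleValue value);
}

interface TagValue {
    <R> R accept(TagValueVisitor<R> visitor);
}

class StringValue implements TagValue {
    final String text;
    StringValue(String text) { this.text = text; }
    public <R> R accept(TagValueVisitor<R> visitor) { return visitor.visitString(this); }
}

class IntegerValue implements TagValue {
    final int number;
    IntegerValue(int number) { this.number = number; }
    public <R> R accept(TagValueVisitor<R> visitor) { return visitor.visitInteger(this); }
}

class DoubleValue implements TagValue {
    final double number;
    DoubleValue(double number) { this.number = number; }
    public <R> R accept(TagValueVisitor<R> visitor) { return visitor.visitDouble(this); }
}

// One visitor per operation: this one formats a value for a table cell.
class DisplayFormatter implements TagValueVisitor<String> {
    public String visitString(StringValue value) { return value.text; }
    public String visitInteger(IntegerValue value) { return Integer.toString(value.number); }
    public String visitDouble(DoubleValue value) {
        return String.format(Locale.ROOT, "%.2f", value.number);
    }
}
```

GUI code would call `value.accept(new DisplayFormatter())` without knowing the concrete type. The trade-off is that adding a new tag type means adding a method to every visitor, which the compiler will at least point out.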

eqbridges
BobMcGee
A: 

I don't think you should consider any of these properties as actual member variables. You should have a "Property" object that contains a property (which would be analogous to a member variable), and a "Collection" object that has collections of properties (which would be like a class).

Since these attributes and collections don't really have code associated with them, it would make no sense to implement them as objects (and it would be a real pain in the butt).

Your attributes and collections need to hold ALL the data specific to them. For instance, if a field is eventually written to the database, it needs to have its table name stored somewhere. If it needs to be written to the screen, that information also needs to be stored somewhere.

Range/value checking can be added to the attributes: when you define what type of data an attribute is, you might have some text that says "MaxLength(12)", which would instantiate a class called MaxLength with the value 12 and store that instance in the attribute. Whenever the attribute's value changes, the new value would be passed to each range checker that has been applied to the attribute. Many kinds of actions can be attached this way.
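Here is one way the "MaxLength(12)" idea might look in code. This is only a sketch under assumptions: the `RangeChecker` interface, the "null means OK" convention, the `CheckerParser` and `CheckedAttribute` classes, and the extra `MinValue` checker are all hypothetical, added to show that more than one kind of checker can be plugged in.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Assumed contract: return null when the value is acceptable, else a message.
interface RangeChecker {
    String check(Object newValue);
}

class MaxLength implements RangeChecker {
    private final int limit;
    MaxLength(int limit) { this.limit = limit; }
    public String check(Object newValue) {
        String text = String.valueOf(newValue);
        return text.length() <= limit ? null : "longer than " + limit + " characters";
    }
}

// Hypothetical second checker, not mentioned in the answer.
class MinValue implements RangeChecker {
    private final int minimum;
    MinValue(int minimum) { this.minimum = minimum; }
    public String check(Object newValue) {
        if (!(newValue instanceof Number)) return "not a number";
        return ((Number) newValue).doubleValue() >= minimum ? null : "less than " + minimum;
    }
}

// Turns definition text such as "MaxLength(12)" into a checker instance.
class CheckerParser {
    private static final Pattern SPEC = Pattern.compile("(\\w+)\\((-?\\d+)\\)");

    static RangeChecker parse(String spec) {
        Matcher m = SPEC.matcher(spec.trim());
        if (!m.matches()) throw new IllegalArgumentException("Bad checker spec: " + spec);
        int argument = Integer.parseInt(m.group(2));
        switch (m.group(1)) {
            case "MaxLength": return new MaxLength(argument);
            case "MinValue":  return new MinValue(argument);
            default: throw new IllegalArgumentException("Unknown checker: " + m.group(1));
        }
    }
}

// An attribute passes every new value to each checker attached to it.
class CheckedAttribute {
    private final List<RangeChecker> checkers = new ArrayList<>();
    private Object value;

    void addChecker(String spec) { checkers.add(CheckerParser.parse(spec)); }

    void setValue(Object newValue) {
        for (RangeChecker checker : checkers) {
            String problem = checker.check(newValue);
            if (problem != null) throw new IllegalArgumentException(problem);
        }
        value = newValue;
    }

    Object getValue() { return value; }
}
```

With `addChecker("MaxLength(12)")` applied, `setValue("Fix My Car")` is accepted and `setValue("Take Over the World")` is rejected, because the second string has more than 12 characters.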

This is just the base. I've designed something like this out and it's a good deal of work, but it's much simpler than trying to do it in a straight language.

I know that this seems like WAY too much work right now (it should if you actually get what I'm suggesting), but keep it in mind and eventually you'll probably go "Hmph, maybe that was worth a try after all".

edit (response to comment):

I thought about trying to work with the registry/key thing (we're still talking attribute value pairs), but it doesn't quite fit.

You are trying to fit DAOs into Java Objects. This is really natural, but I've come to see it as just a bad approach to solving the DAO/DTO problem. A Java Object has attributes and behaviors that act on those attributes. For the stuff you are doing, there are no behaviors (for instance, if a user creates a "Birthday" field, you won't be using object code to calculate his age because you don't really know what a birthday is).

So if you throw away having Objects and attributes, how would you store this data?

Let me go with a very simple first step (that is very close to the registry/tag system you mentioned): where you would have used an object, use a hashtable. For your attribute names use keys; for the attribute values, use the values in the hashtable.
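In code, that first step is about as small as it sounds. The sketch below uses a `HashMap` as the hashtable; the class name `HashtableRecord` and the "Priority" tag are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class HashtableRecord {
    // Builds the "object": attribute names are keys, attribute values are plain Objects.
    static Map<String, Object> newResource() {
        Map<String, Object> resource = new HashMap<>();
        resource.put("ProjectName", "Fix My Car");
        resource.put("Priority", 2);
        return resource;
    }
}
```

Nothing stops a caller from later doing `resource.put("Priority", "urgent")`, since every value is just an `Object`.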

Now, I'll go through the problems and solutions I took to enhance this simple model.

Problem: you've lost Strong Typing, and your data is very free-format (which is probably bad)

Solution: Make a base class for "Attribute" to be used in the place of the value in the hashtable. Extend that base class for IntegerAttribute, StringAttribute, DateAttribute, ... Don't allow values that don't fit that type. Now you have strong typing, but it's runtime instead of compile time--probably okay since your data is actually DEFINED at runtime anyway.
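A minimal sketch of that base class and its subclasses. The `accepts` hook and the choice of `IllegalArgumentException` are assumptions; the answer only says that values of the wrong type must not be allowed.

```java
import java.util.Date;

// Base class: each subclass decides at runtime which values it accepts.
abstract class Attribute {
    private Object value;

    protected abstract boolean accepts(Object candidate);

    public void setValue(Object candidate) {
        if (!accepts(candidate)) {
            throw new IllegalArgumentException(
                getClass().getSimpleName() + " cannot hold " + candidate);
        }
        value = candidate;
    }

    public Object getValue() { return value; }
}

class StringAttribute extends Attribute {
    protected boolean accepts(Object candidate) { return candidate instanceof String; }
}

class IntegerAttribute extends Attribute {
    protected boolean accepts(Object candidate) { return candidate instanceof Integer; }
}

class DateAttribute extends Attribute {
    protected boolean accepts(Object candidate) { return candidate instanceof Date; }
}
```

The hashtable then becomes a `Map<String, Attribute>` instead of a `Map<String, Object>`, and a mismatched value fails at the moment it is set rather than somewhere downstream. Note that in this sketch `null` is rejected too, because `instanceof` is false for `null`.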

Problem: Formatters and Validators

Solution: Have the ability to create a plug-in for your attribute base-class. You should be able to "setValidator" or "setFormatter" for any attribute. The validator/formatter should live with the attribute--so you probably have to be able to serialize them to the DB when you save the attribute.

The nice part here is that when you do "attribute.getFormattedValue()" on the attribute, it's pre-formatted for display. attribute.setValue() will automatically call the validator and throw an exception or return an error code if any of the validations fail.
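A sketch of the plug-in hooks, assuming two small hypothetical interfaces (`ValueValidator`, `ValueFormatter`) and the exception-throwing variant of setValue(). Serializing the plug-ins to the DB is left out here.

```java
// Assumed convention: a validator returns null when the value is fine, else a message.
interface ValueValidator {
    String validate(Object candidate);
}

interface ValueFormatter {
    String format(Object value);
}

class PluggableAttribute {
    private Object value;
    private ValueValidator validator = candidate -> null; // default: accept everything
    private ValueFormatter formatter = String::valueOf;   // default: plain toString

    void setValidator(ValueValidator validator) { this.validator = validator; }
    void setFormatter(ValueFormatter formatter) { this.formatter = formatter; }

    // Runs the validator first; the stored value changes only if it passes.
    void setValue(Object candidate) {
        String problem = validator.validate(candidate);
        if (problem != null) throw new IllegalArgumentException(problem);
        value = candidate;
    }

    Object getValue() { return value; }

    // Returns the value already formatted for display.
    String getFormattedValue() { return formatter.format(value); }
}
```

For example, an attribute given a formatter of `v -> String.format(Locale.ROOT, "$%.2f", v)` and a value of `3.5` would hand the GUI the string `$3.50`.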

Problem: How do I display these on the screen? We already have getFormattedValue(), but where does it display on the screen? What do we use for a label? What kind of control should edit this field?

Solution: I'd store all these things inside EACH attribute. (The order should be stored in the Class, but since that's a hashtable it won't work; we'll get to that next.) If you store the display name, the type of control used to render this (text field, table, date,...) and the database field name, this attribute should have all the information it needs to interact with display and database I/O routines written to deal with attributes.
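As a rough illustration of carrying that metadata around, here is a sketch. `AttributeInfo`, `ControlType` and the `DbIo.columnList` helper are hypothetical names; the helper just shows that an I/O routine can be driven by the stored metadata alone.

```java
import java.util.List;

// The kinds of control an attribute can ask for (list taken from the text above).
enum ControlType { TEXT_FIELD, TABLE, DATE }

class AttributeInfo {
    final String displayName;   // label shown next to the control
    final ControlType control;  // which widget should edit the value
    final String dbFieldName;   // field name used by the storage code

    AttributeInfo(String displayName, ControlType control, String dbFieldName) {
        this.displayName = displayName;
        this.control = control;
        this.dbFieldName = dbFieldName;
    }
}

class DbIo {
    // A storage routine can work from the metadata, with no per-attribute code.
    static String columnList(List<AttributeInfo> attributes) {
        StringBuilder columns = new StringBuilder();
        for (AttributeInfo info : attributes) {
            if (columns.length() > 0) columns.append(", ");
            columns.append(info.dbFieldName);
        }
        return columns.toString();
    }
}
```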

Problem: The Hashtable is a poor interface for a DAO.

Solution: This is absolutely right. Your hashtable should be wrapped in a class that knows about the collection of attributes it holds. It should be able to store itself (including all its attributes) to the database--probably with the aid of a helper class. It should probably be able to validate all the attributes with a single method call.
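A sketch of such a wrapper, assuming a hypothetical `CheckableAttribute` interface with a "null means valid" convention. It uses a `LinkedHashMap` so the attributes keep a stable order, and leaves the database helper out entirely.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical minimal attribute: a value plus a self-check.
interface CheckableAttribute {
    Object getValue();
    String validate(); // null when valid (assumed convention)
}

class AttributeCollection {
    // LinkedHashMap iterates in insertion order, so attributes have a fixed display order.
    private final Map<String, CheckableAttribute> attributes = new LinkedHashMap<>();

    void define(String name, CheckableAttribute attribute) {
        attributes.put(name, attribute);
    }

    CheckableAttribute get(String name) {
        CheckableAttribute attribute = attributes.get(name);
        if (attribute == null) throw new IllegalArgumentException("No such attribute: " + name);
        return attribute;
    }

    // Attribute names in the order they were defined.
    List<String> names() { return new ArrayList<>(attributes.keySet()); }

    // Validates every attribute in one call and collects the failures.
    List<String> validateAll() {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, CheckableAttribute> entry : attributes.entrySet()) {
            String problem = entry.getValue().validate();
            if (problem != null) problems.add(entry.getKey() + ": " + problem);
        }
        return problems;
    }
}
```

Callers only ever see `AttributeCollection`, so the map could later be replaced by something indexed or lazily loaded without touching the GUI code.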

Problem: How to actually work with these things?

Solution: Since they contain their own data, at any point in your system where they interact (say with the screen or with the DB), you need an "Adapter".

Let's say you are presenting a screen to edit your data. Your Adapter would be passed a frame and one of your hashtable-based DTOs.

First it would walk through the list of attributes in order. It would ask the first attribute (say a string) what kind of control it wanted to use for editing (let's say a text field).

It would create a text field, then add a listener to the text field that updates the data; this binds your data to the control on the screen.

Now whenever the user updates the control, the update is sent to the Attribute. The attribute stores the new value, and you're done.

(This will be complicated by the concept of an "OK" button that transfers all the values at once, but I would still set up each binding beforehand and use the "OK" as a trigger.)
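For a Swing GUI, the simplest form of that binding (immediate update, no "OK" button) might look like the sketch below. Swing is an assumption here, since the question doesn't name a toolkit, and `TextAttribute` and `TextFieldAdapter` are hypothetical names. The listener is attached to the text field's document, which is where Swing reports text changes.

```java
import javax.swing.JTextField;
import javax.swing.event.DocumentEvent;
import javax.swing.event.DocumentListener;
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.PlainDocument;

// Hypothetical attribute holding a String value.
class TextAttribute {
    private String value = "";
    void setValue(String value) { this.value = value; }
    String getValue() { return value; }
}

class TextFieldAdapter {
    // Returns a document that pushes every edit back into the attribute.
    static Document bindDocument(final TextAttribute attribute) {
        final PlainDocument doc = new PlainDocument();
        try {
            doc.insertString(0, attribute.getValue(), null); // show the current value first
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
        doc.addDocumentListener(new DocumentListener() {
            public void insertUpdate(DocumentEvent e) { push(); }
            public void removeUpdate(DocumentEvent e) { push(); }
            public void changedUpdate(DocumentEvent e) { push(); }

            private void push() {
                try {
                    attribute.setValue(doc.getText(0, doc.getLength()));
                } catch (BadLocationException ex) {
                    throw new IllegalStateException(ex);
                }
            }
        });
        return doc;
    }

    // The control the attribute asked for, backed by the bound document.
    static JTextField createEditor(TextAttribute attribute) {
        return new JTextField(bindDocument(attribute), null, 20);
    }
}
```

The adapter would add the field returned by `createEditor` to the frame. To support an "OK" button, the listener could write into a pending copy instead and only call `attribute.setValue` when the button fires.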

This binding can be difficult. I've done it by hand; once I used a toolkit called "JGoodies" that had some binding ability built in so that I didn't have to write each possible binding combination myself, but I'm not sure it saved much time in the long run.

This is way too long. I should just create a DAO/DTO toolkit someday--I think Java Objects are not at all suited as DAO/DTO objects.

If you're still stumped, feel free to Email/IM me-- bill.kress at gmail..

Bill K
I'm having difficulty understanding what this means in terms of code, but it sounds good in theory. Please help me see if I understand your design correctly. Tag: a singleton class for each kind of tag, with validators and storage info associated. Can get values for each resource. Resource: stores Set<Tag>, where each Tag can be used to get the value for the resource. TagRegistry: collection of all defined tags, and factory for new ones. Is that right? I agree that it sounds like a lot of work, unless TagRegistry has very good factory methods. Might also require Reflection use.
BobMcGee
Thanks! Your design perfectly covers all the areas I wanted but wasn't able to make workable. I will probably have to use a simpler system for initial design (to have something for the boss to see in action), but will plan on using this sort of system in the end.
BobMcGee
Good luck. You might also look into Spring--I believe there are pieces of it that could help.
Bill K