Given the following domain classes:

class Post {
    SortedSet tags
    static hasMany = [tags:Tag]
}

class Tag {
    static belongsTo = Post
    static hasMany = [posts:Post]
}

From my understanding so far, using hasMany results in a Hibernate Set mapping. However, to maintain uniqueness and order, Hibernate needs to load the entire set from the database and compare the elements' hashes.

This could lead to a significant performance problem with adding and deleting posts/tags if their sets get large. What is the best way to work around this issue?

Thanks!

A: 

The ordering of the set is guaranteed by the Set implementation, i.e., the SortedSet. Unless you use a List, which keeps track of indexes in the database, the ordering is done server-side only.

If your domain class is stored in a SortedSet, it has to implement Comparable so that the set can be sorted properly.
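As a minimal sketch of what implementing Comparable could look like, here is a plain-Groovy stand-in for Tag that runs without Grails. The `name` property and the choice to sort alphabetically by it are assumptions made for illustration; the Tag class in the question does not show any properties.

```groovy
// Plain-Groovy stand-in for the Tag domain class (no GORM involved).
// `name` is a hypothetical property, not part of the original question.
class Tag implements Comparable<Tag> {
    String name

    // Order tags alphabetically by name; a SortedSet relies on compareTo.
    int compareTo(Tag other) {
        name <=> other.name
    }
}

// TreeSet is the standard java.util SortedSet implementation.
SortedSet<Tag> tags = new TreeSet<Tag>()
tags << new Tag(name: 'hibernate') << new Tag(name: 'grails') << new Tag(name: 'gorm')

// The set iterates in compareTo order, regardless of insertion order.
def names = tags.collect { it.name }
// names == ['gorm', 'grails', 'hibernate']
```

Note that a TreeSet also uses compareTo to decide uniqueness, so in this sketch two tags with the same name would collapse into one entry.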

Performance is not really an issue here. If you want to access a single Tag, you should get it by its id. If you want the sorted tags, the sort only makes sense when you are looking at all Tags rather than a particular one, so you end up retrieving all Tags at once. Since the sorting is performed server-side and not in the database, there is not much difference between a SortedSet and a regular HashSet as far as the database is concerned.

Miguel Ping
Do you have any documentation/evidence that sorting is performed server-side for a SortedSet?
Robert Fischer
@Miguel, thanks for the info. I'll dig deeper into this issue.
Walter
+1  A: 

Hibernate/GORM does not guarantee any order in the default mapping. Therefore, it doesn't have to load elements from the database in order to do the sorting. You will have your hands on a bunch of ids, but that's the extent of it.

See 19.5.2: http://www.hibernate.org/hib_docs/reference/en/html/performance-collections.html

In general, Hibernate/GORM is going to have better performance than you expect. Unless and until you can actually prove a real-world performance issue, trust in the framework and don't worry about it.

Robert Fischer
Thanks for the link. I agree that I should avoid premature optimization at this stage.
Walter
A: 

The Grails docs seem to have been updated:

http://grails.org/doc/1.0.x/

In section 5.2.4 they discuss the potential performance issues for the collection types.

Here's the relevant section:

A Note on Collection Types and Performance

The Java Set type is a collection that doesn't allow duplicates. In order to ensure uniqueness when adding an entry to a Set association, Hibernate has to load the entire association from the database. If you have a large number of entries in the association, this can be costly in terms of performance.

The same behavior is required for List types, since Hibernate needs to load the entire association in order to maintain order. Therefore it is recommended that if you anticipate a large number of records in the association, you make the association bidirectional so that the link can be created on the inverse side. For example, consider the following code:

def book = new Book(title:"New Grails Book")
def author = Author.get(1)
book.author = author
book.save()

In this example the association link is being created by the child (Book), and hence it is not necessary to manipulate the collection directly, resulting in fewer queries and more efficient code. Given an Author with a large number of associated Book instances, if you were to write code like the following you would see an impact on performance:

def book = new Book(title:"New Grails Book")
def author = Author.get(1)
author.addToBooks(book)
author.save()
Walter