I'm taking on the re-architecting of a pair of applications. One uses Hibernate; the other uses a combination of Hibernate and a Java Content Repository (specifically JackRabbit).

A key issue in the rearchitecting is to improve performance, so I'm wondering whether there's any value in bringing in a DBA for the design and development of the application.

Note that I'm not questioning the value in having a DBA involved in managing the production databases. But in past projects, it's been essential to have a good DBA involved in the design and coding phases, working out ways to optimize the data structures, putting code into stored procedures, etc.

But given that the database structures are almost completely managed by Hibernate and JackRabbit, there's not much scope to optimize them. Sure, if we find they don't perform well a DBA could potentially identify issues and we could submit patches to improve them, but I don't know that we would want (or be able) to do much in the way of application-specific tuning.

Another reason for wondering about the role of a DBA in this type of application is that the bulk of our performance issues are most likely above the persistence layer. That is, it's not that the database, Hibernate, or JackRabbit is too slow; it's that the way we have structured our data and push it around is not very good. Fixing this will involve data modeling, but the implementation medium is XML files and Java code rather than database tables and SQL. Does a DBA typically know much about this type of thing?

The thing that keeps me from completely dismissing the need for a DBA in the design and development of an application built on top of a persistence layer is skepticism. I don't quite believe that the need for database optimization for a specific application is completely waved away by using a pre-packaged solution.

Am I missing key points? Can a skilled DBA tweak Hibernate configuration files to make things blazingly fast for my app's specific use cases? Is it madness to consider running a high-load Hibernate app without having a DBA manually tune the DB itself, building indices, etc.? Or is there a new creature in the development landscape who specializes in optimizing XML-based data models and abstracted persistence layers?

Thanks in advance for thoughts and suggestions!

+2  A: 

Is it madness to consider running a high load Hibernate app without having a DBA manually tune the DB itself, building indices, etc.?

Yes. AFAIK Hibernate doesn't do any optimisation of the DB, because these things are always workload dependent.

To address your bigger question: of course you need someone who is adept at tuning the database for performance, and yes, using Hibernate does change the required skill set.

David Schmitt
A: 

I'd say it depends on your app. You can still do native queries with Hibernate, so it depends on whether you have any of those and whether they need tuning. It also depends on the performance required: if there are any performance-critical sections, you may need support in identifying what is slowing them down. And some DBs just need more admin than others (Oracle...).

Chris Kimpton
A: 

I agree with David. It's even worse: the developers who use the persistence layer should have good DB knowledge, so they understand why some of their calls are time-expensive and how to find workarounds.
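As a rough illustration of an expensive call pattern that is easy to miss behind a persistence layer, here is a minimal sketch of loading child rows one parent at a time versus in one batched call. This is not Hibernate's API: `FakeDb` and all of its methods are hypothetical stand-ins, and each method call simply counts as one database round trip.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class QueryCountSketch {

    /** Hypothetical stand-in for a database: each method call counts as one round trip. */
    public static class FakeDb {
        private final Map<Integer, List<String>> itemsByOrder = new LinkedHashMap<>();
        public int roundTrips = 0;

        public FakeDb() {
            itemsByOrder.put(1, Arrays.asList("pen", "ink"));
            itemsByOrder.put(2, Arrays.asList("paper"));
            itemsByOrder.put(3, Arrays.asList("stapler", "staples"));
        }

        /** One round trip: fetch the ids of all orders. */
        public List<Integer> findOrderIds() {
            roundTrips++;
            return new ArrayList<>(itemsByOrder.keySet());
        }

        /** One round trip per order: fetch the items of a single order. */
        public List<String> findItems(int orderId) {
            roundTrips++;
            return itemsByOrder.get(orderId);
        }

        /** One round trip in total: fetch the items of many orders at once. */
        public Map<Integer, List<String>> findItemsForOrders(List<Integer> orderIds) {
            roundTrips++;
            Map<Integer, List<String>> result = new LinkedHashMap<>();
            for (Integer id : orderIds) {
                result.put(id, itemsByOrder.get(id));
            }
            return result;
        }
    }

    /** 1 call for the ids plus 1 call per order. */
    public static int countItemsOneByOne(FakeDb db) {
        int total = 0;
        for (int id : db.findOrderIds()) {
            total += db.findItems(id).size();
        }
        return total;
    }

    /** 1 call for the ids plus 1 call for all items. */
    public static int countItemsBatched(FakeDb db) {
        int total = 0;
        for (List<String> items : db.findItemsForOrders(db.findOrderIds()).values()) {
            total += items.size();
        }
        return total;
    }

    public static void main(String[] args) {
        FakeDb a = new FakeDb();
        FakeDb b = new FakeDb();
        System.out.println(countItemsOneByOne(a) + " items, " + a.roundTrips + " round trips");
        System.out.println(countItemsBatched(b) + " items, " + b.roundTrips + " round trips");
    }
}
```

Both methods return the same count, but the first makes one call per order on top of the initial one. With three orders that is 4 round trips against 2, and the gap grows with the number of orders.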

Nicolas
+2  A: 

Hibernate can control database structures. That doesn't mean that hibernate should control them.

If you have a large app with a lot of data and performance is critical, I probably wouldn't use auto-generated table definitions. I'd want a completely optimized database structure and then write the Hibernate mappings to use that. If you get a DBA that understands development a little, they can even write the HQL or the custom SQL to make things better.

(I've never used JackRabbit, so I can't comment there)

Also, it's probably the DBA who will be helping you troubleshoot performance problems during testing.

Marc Hughes
+5  A: 

There are DBAs and there are DBAs. Some DBAs are administrators -- backup, restore, grant, revoke -- kind of people. Keep The Lights On. Foundational.

Other DBAs are architect/designers. "Fixing this will involve data modeling": that's what this second tier of DBAs should be doing.

Many admin DBAs are thrust into the architect role -- they know SQL, after all -- but aren't really suited to it. You know you've got the wrong person when...

  1. They obsess over table and column naming conventions.

  2. They obsess over FK/PK relationships, ignoring the fact that once you've fetched the rows and made them into objects you have a lot of rich, sophisticated collection classes available to manage relationships.

  3. They can't divorce rows in a table from Objects in the application and the real-world entities that both things are implementations of. This can often be a show-stopper. If you have a complex real-world object that is implemented by a complex programming-language structure and also maps to a complex database structure, it can get confusing. And some people retreat to their comfort zone and start repeating meaningless phrases like "It's all just bits" or "ultimately, everything's a FK, even object references".

  4. They demand that everything be a stored procedure "because it's faster." This is worse if they can't provide evidence.

Here's the point...

Performance rests on two things: Data Structures and Algorithms. Minimizing resource use (I/O, memory, etc.), is done by picking the right data structures and algorithms.

Database denormalization is a way of tweaking the data structure to match the algorithm. Other performance tuning is largely the same concept: changing parameters and options to make the data structure better match the application algorithm.

This should go both ways. You should look at your entities, your requirements, and work out both data structures and algorithms that do the right thing. Once you've done that, you can tweak up the sizes of buffers and what-not to get a little bit better performance.

Fundamentally, blazing speed comes from considering the inner-most inner-most loops: what are they looping over? What are they searching for? How can they be replaced with something that doesn't loop as much or doesn't loop at all?
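A minimal sketch of that idea, with made-up data and names (`InnerLoopSketch`, `joinByScan`, and `joinByIndex` are all hypothetical, and customer ids are assumed to be unique): the first method searches the whole customer array for every order, while the second builds a lookup map once so the inner loop disappears.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class InnerLoopSketch {

    // Nested scan: for every order, search the customer array from the start.
    // Cost grows with (number of orders) x (number of customers).
    public static List<String> joinByScan(int[] orderCustomerIds, int[] customerIds, String[] customerNames) {
        List<String> names = new ArrayList<>();
        for (int orderCustomerId : orderCustomerIds) {
            for (int i = 0; i < customerIds.length; i++) {
                if (customerIds[i] == orderCustomerId) {
                    names.add(customerNames[i]);
                    break; // ids are assumed unique, so stop at the first match
                }
            }
        }
        return names;
    }

    // Indexed lookup: build the map once, then each order costs one hash lookup.
    // Cost grows with (number of orders) + (number of customers).
    public static List<String> joinByIndex(int[] orderCustomerIds, int[] customerIds, String[] customerNames) {
        Map<Integer, String> nameById = new HashMap<>();
        for (int i = 0; i < customerIds.length; i++) {
            nameById.put(customerIds[i], customerNames[i]);
        }
        List<String> names = new ArrayList<>();
        for (int orderCustomerId : orderCustomerIds) {
            String name = nameById.get(orderCustomerId);
            if (name != null) {
                names.add(name); // orders with an unknown customer are skipped, as in the scan
            }
        }
        return names;
    }

    public static void main(String[] args) {
        int[] customerIds = {1, 2, 3};
        String[] customerNames = {"Ann", "Bob", "Cy"};
        int[] orders = {3, 1, 3, 2};
        System.out.println(joinByScan(orders, customerIds, customerNames));
        System.out.println(joinByIndex(orders, customerIds, customerNames));
    }
}
```

Both methods produce the same list. The same trade applies whether the lookup structure is a `HashMap` in application code or an index in the database.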

If your DBA can participate in the algorithm and data structure design, they're an asset, use them heavily.

If your DBA can't participate, then don't limit your design to what they're comfortable with.

S.Lott