views:

230

answers:

7
+1  Q: 

OO and SQL

In trying to understand the correlation between program objects and data in tables (here:http://stackoverflow.com/questions/1149877/oo-program-and-sql-database), which I still don't really understand, a strong difference of opinion was revealed over whether it's ok to use a SQL db to store data for an OO program.

  1. For a new programmer, is it ok/recommended, or not?

  2. If not, is the alternative to an OO program a procedural program?

Also I don't understand Object-relational impedance mismatch, which is what people mean when they say that it's "not OK to use an SQL DB to store data for an OO program": can someone summarize it, or is there a simple example that illustrates it?

+2  A: 

I would say it if definitly OK to use an SQL Database to store data for an Object-oriented program (I do this almost all the time -- and it works quite fine)

Procedural vs Object-Oriented is quite a debate ; I don't think you should choose depending on the storage type you're using : the choice should be more about your code and your application.

OOP comes with many advantages, one of which is (in my opinion) better maintenability ; better isolation, encapsulation, and all that is true too. Actually, after having programmed in OOP for a couple of years, I would find hard (and I actually do...) to do procedural programming ; the ability of having data and related-methods to manipulate them in a single class, used a an independant entity, is really great.

Pascal MARTIN
Thanks for your help!
ChrisC
Procedural? You mean declarative for SQL?
gbn
I was meaning for the program side : "procedural program" vs "oop"
Pascal MARTIN
+2  A: 

One problem with having one class = one table is that of normalization. There are also relationships between classes which don't map easily one to one. So my recommendation is - when doing the design for the database, do a proper schema; when doing the object oriented design, do it so without any dependency on the DB (hence the DB could be XML, flat-files, or even hand-coded array), and then write another class which would fetch the values from the form of the database which you have chosen.

Extrakun
Is it necessary to do the OO program design before I can do a proper schema? What are the characteristics of a proper schema?Thank you for your help.
ChrisC
Those two are separate discipline. The definition of a proper schema is a vast topic by itself. Some rules of thumb is normalization, accurate representation of relationship between entities and etc. Object oriented design has some overlap with database design but the biggest difference is that for OO you have to keep in mind entities can have behaviours and a relationship between two entities can be more than just containment or inheritance.
Extrakun
Thanks for your help. I see what you're saying about the separate disciplines, but as the developer I would have to develop both. I'm sorry, maybe I'm not asking clearly, I was just trying to find out how much the program and db need to consider each other during design. Would I design the oo program before ever thinking about the tables in the db, then design the db according to how the program access or uses the data? Thanks again.
ChrisC
Chris: Yes you should. Unless you do the DB Schema first (the "database design") and then derive the program design from it. Either way, they are strongly linked so they cannot be done independent of each other.
RBarryYoung
I'm sorry, I don't understand. I asked, "Would I design the oo program before ever thinking about the tables in the db, then design the db according to how the program access or uses the data?" And you said, "Yes you should. Unless you do the DB Schema first (the "database design") and then derive the program design from it. Either way, they are strongly linked so they cannot be done independent of each other." So, if I design the program first, then design the db according to how the program uses it, is that not designing them independently? It seems like a paradox (to me). Thanks a lot!
ChrisC
I usually do the DB design first, then the classes. I think it is a matter of personal preference (but if working in a team, follow the team's policy). It's also a matter of experience. Try out both approach and get a feel, but I usually believe in decoupling the data loading/saving process from the classes itself, so that my influence my beliefs somehow.
Extrakun
"Is it necessary to do the OO program design before I can do a proper schema?" --> "Is it necessary to decide what I'm storing before I choose where/how to store it?"
Dems
A: 

SQL databases are relational databases. Relational databases are currently the king of the hill with respect to data storage and retrieval (including program objects), except for the noSQL movement, which deals primarily with extremely large datasets.

To assist in the translation process between objects and relational tables, most programmers use some sort of Object Relational Mapper like Linq to SQL or nHibernate. This greatly reduces the effort required.

There are object oriented databases that can read and write object instances directly, but these have not caught on in the mainstream because of difficulties in reporting and retrieval of data across many objects.

Robert Harvey
Thank you for your help.
ChrisC
A: 

First, using OO with SQL is fine. Many developers do it all the time and frankly, there's no other option available today that I think is even half as good.

That said, that are still some problems. The central one IMHO is that the Designers/Architects want Business Classes to map 1-1 to DB Tables and they just don't. It starts out well enough in high-level design, but the real problem comes you try to implement sub-classing: it just doesn't map very well to SQL and relational concepts.

So some means of mediating this slight incompatibilty (the "impedance mismatch") has to be brought into the development. There are lots of different "means" of doing this (tools, language features, DB types and options, as well as design & development disciplines) and they all have different advantages and disadvantages.

The upshot that you CAN do it, and you SHOULD do it, indeed many of us MUST do it, but there are definitely some dirty parts in the middle.

RBarryYoung
Thanks! I'm looking at writing a checkbook program (I already have a working Access db that does most of what I want to use as a functional model) and I've been convinced by others I should write it OO (for the practice, and the maintainability), with a SQL back-end, so with respect to the "dirty parts in the middle", which language is better to use: Java, C++, or Python? And should I let the "dirty parts in the middle" be a deciding factor in choosing a language? (The only training/experience is the VBA in the Access db and an OO Logic class and an Intro to C++ class)Thanks so much.
ChrisC
I couldn't say as I'm not very proficient in any of those languages. IIRC, Python is a multi-paradignm language (OO, functional, dynamic?), so it's likely to be a substantially different situation from the other two. The answer probably has more to do with what tools are available to you in that language that the language itself. I don't know if Linq is available to C++ or not, but that's one possible starting point.
RBarryYoung
+3  A: 

For a new programmer, is it ok/recommended, or not?

That question implies that the system design/components/architecture are chosen/intended to benefit the programmer.

Instead I prefer to choose the design/components/architecture to benefit the system (and the system's owners, users, and operators), and ensure that the programmers know (which may require some learning or training on their part) what they need in order to develop that system.

The facts are:

  • OO is often a good way to develop software
  • SQL DBs are often a good way to store data
  • Designing the mapping from SQL to OO may be non-trivial

If not, is the alternative to an OO program a procedural program?

Well, maybe, yes: one of the features of OO is subclassing, and subclasses/inheritance is one of the things that's problematic to model/store in a SQL database.

For example, given a OOD like this ...

class Animal
{
  int id;
  string name;
  abstract void eat();
  abstract void breed();
}

class Dog : Animal
{
  bool pedigree;
  override void eat() {...}
  override void breed() {...}
}

class Bird : Animal
{
  bool carnivore;
  int numberOfEggs;
  void fly() {...}
  override void eat() {...}
  override void breed() {...}
}

... it isn't obvious whether to store this data using 2 SQL tables, or 3. Whereas if you take the subclassing away:

class Dog
{
  int id;
  string name;
  bool pedigree;
  void eat() {...}
  void breed() {...}
}

class Bird
{
  int id;
  string name;
  bool carnivore;
  int numberOfEggs;

  void fly() {...}
  void eat() {...}
  void breed() {...}
}

... then it's easier/more obvious/more 1-to-1/more accurate to model this data using exactly two tables.

Also I don't understand Object-relational impedance mismatch

Here's an article that's longer and more famous than the Wikipedia article; maybe it's easier to understand: The Vietnam of Computer Science

Note that one of the solutions, which is proposes to the problem, is:

"Manual mapping. Developers simply accept that it's not such a hard problem to solve manually after all, and write straight relational-access code to return relations to the language, access the tuples, and populate objects as necessary."

In other words, it's not such a hard problem in practice. It's a hard problem in theory, i.e. it's hard to write a tool which creates the mapping automatically and without your having to think about it; but that's true of many aspects of programming, not only this one.

ChrisW
Thanks for the great illustration and answer! I think I understand the dilemma now. This may be a dumb question, but given the "Vietnam" analogy (and the widely recognized problems) what is the reason people don't just map manually? It seems like the way to go.
ChrisC
I guess it's because sometimes people have dozens (or, heaven knows, more) of classes and/or tables, and want to automate the mapping; not because it's all that hard, but because it's tedious and (if don by hand) slightly error-prone (because humans are fallible). For example, I worry about whether my source code is correct, but not about whether my object code is correct: because I trust the compiler's automated mapping of source code to object code. As an OO programmer who's used to things like type-safety, hand-coding SQL is a little bit worrying: because it's just text, the compiler ...
ChrisW
... won't help to ascertain whether the SQL which you write is any good. Yes there are ways around that (e.g., *testing* the SQL), nevertheless it would be tempting to want an automated solution, if that were possible and had no disadvantages.
ChrisW
Thanks. I understood everything except this part, "As an OO programmer who's used to things like type-safety, hand-coding SQL is a little bit worrying". I guess the part I don't understand (and it may be peripheral to your main point, so tell if it's inconsequential) is the contrast between being used to "type safety" and hand coding SQL. Mainly I think I don't know what type safety is, because I do understand that SQL queries are handwritten. The reason I ask is because I wanted to make sure I understand you 100%. Thanks again!
ChrisC
Or are you just using the contrast between system-handled things being accurately rendered and handwritten having the human capacity for error? Sorry for my slowness.
ChrisC
I mean that when writing in C++ or C#, if I make a typo then the compiler will tell me that I have a syntax error: and if on the compiler's happy with it, that's some assurance (to me) that my code is correct. Furthermore if someone renames a class, for example, or removes a member data field which I am accessing, then my code will no longer compiler (i.e. the compiler will catch the error); but if someone renames a SQL table for example, or removes a SQL column, there's no compiler to tell me that the SQL commands embedded as text fragments in my code are no longer valid. For this reason, ...
ChrisW
... I find plain, hand-written SQL to be less inherently maintainable than code which has compile-time type-checking (e.g. C++ and not e.g. JavaScript).
ChrisW
A: 
ChrisC
"Yes" to your first question, and "I don't know" or "I don't understand your question/suggestion" to your second. Let's say I have a mapper which accepts some meta-data as input, and which outputs OO classes and SQL tables and the glue between them: instead of editing any of that by hand (and risking e.g. old code with new tables), the dream might be to edit the meta-data and re-generate the whole (self-consistent) system.
ChrisW
Thanks Chris W. You've really helped me a lot. I don't quite understand your example, but I can't ask for any more of your time considering how much you've given and how patient you've been with me here. And hopefully after I learn more about how things fit together and run, I'll understand it then. Have a good one!
ChrisC
A: 

Chris W.'s response is excellent, one of the best I've read on this subject. I just want to throw in my two cents about one aspect: the problem of expressing subclassing in SQL table design.

If you do a web search on "generalization specialization relational modeling" you will get several good articles on the subject of subclassing. It's a problem that comes up repeatedly in conceptual data modeling. It's a pity that courses for database neophytes don't include this topic more often.

Walter Mitty