views:

212

answers:

4

Django has this complex ORM built into it, but after spending a lot of time on it, I still find it hard to write queries that are remarkably simple in SQL. There are even some simple things that I can't find a way to do through the Django ORM (e.g. 'select distinct column1 from tablename').

Is there any documentation that shows "For common SQL statements, here is how you do it in django"?

(I did try Google first, but either it isn't out there or I just can't think of the right query...)

+3  A: 

A good starting point for doing Django queries is the Django docs themselves.

http://docs.djangoproject.com/en/dev/topics/db/queries/

Here are a few examples:

select * from table
=
ModelName.objects.all()

filtering:

select * from table where column = 'foo'
=
ModelName.objects.filter(column='foo')

Specifically regarding distinct: use the distinct() method of a Django queryset.

Here's the relevant link in the documentation. http://docs.djangoproject.com/en/dev/ref/models/querysets/#distinct

Update: The ORM helps you by letting you interact with your data in an object-oriented way. You don't write code that translates the result set of your query into a set of objects. It does that automatically. That's the fundamental change in thought process you have to make.

You start to think in terms of "I have this object, I need to get all the other objects which are like it." Then you can ask the ORM for those objects: "ORM, I need all the objects of class Product which have a color attribute of 'blue'."

Django's specific ORM language for that is:

products = Product.objects.filter(color='blue')

This is done instead of:

  • writing your SQL query,
  • properly escaping all the arguments,
  • connecting to the database,
  • querying the database and handling the connection / query errors,
  • getting the result set,
  • iterating across the result set translating the returned values into proper objects which you can call methods on.

That's the value in using an ORM. Code simplification and reduced development time.
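
For comparison, the manual steps in the list above might look roughly like the following. This is only a sketch: it uses the standard library's sqlite3 module rather than Django, and the product table, the Product class, and the fetch_products_by_color helper are hypothetical names made up for illustration.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Product:
    # Hypothetical stand-in for what a Django model instance would give you.
    id: int
    name: str
    color: str


def fetch_products_by_color(conn, color):
    """Do by hand roughly what Product.objects.filter(color=...) does for you."""
    # Write the SQL; pass arguments as parameters so the driver escapes them.
    sql = "SELECT id, name, color FROM product WHERE color = ? ORDER BY id"
    try:
        cursor = conn.execute(sql, (color,))
    except sqlite3.Error as exc:
        # Handle connection / query errors.
        raise RuntimeError("query failed: %s" % exc)
    # Translate each row of the result set into an object.
    return [Product(*row) for row in cursor.fetchall()]


# Connect to an in-memory database and load a few illustrative rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, color TEXT)")
conn.executemany(
    "INSERT INTO product (name, color) VALUES (?, ?)",
    [("kettle", "blue"), ("mug", "red"), ("plate", "blue")],
)

products = fetch_products_by_color(conn, "blue")
print([p.name for p in products])
```

With the ORM, everything in fetch_products_by_color collapses into the single filter() call shown earlier.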

wlashell
Of course, I read the documentation before asking, though in the modern world I can understand why you might assume otherwise. My problem is that the documentation doesn't answer my questions very effectively. I never found RDBMS difficult to understand, and it is not at all clear how the ORM is supposed to be "helping" me. I was hoping to find "A Traveler's Phrase Book: Django for SQL-speakers".
sienkiew
Updated my answer for you, I hope this helps explain the thought process more.
wlashell
+4  A: 

There are some things that are ridiculously simple in SQL that are difficult or impossible through an ORM. This is called the "object-relational impedance mismatch." Essentially an ORM treats each row in a database as a separate object. So operations that involve treating values separately from their row become fairly challenging. Recent versions of Django (1.1+) improve this situation somewhat with aggregation support, but for many things, only SQL will work.

To this end, Django provides several ways of dropping down into raw SQL quite simply. Some of them return model objects as results, while others take you all the way down to your DBAPI2 connector. The lowest-level approach looks like this:

from django.db import connection

cursor = connection.cursor()
cursor.execute("SELECT DISTINCT column1 FROM tablename")
row = cursor.fetchone()
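
Note that fetchone() returns a single row as a tuple, not the bare column value. A small sketch of collecting every distinct value follows; it assumes a throwaway in-memory SQLite table and uses the standard library's sqlite3 cursor as a stand-in for Django's cursor, since both follow the same DB-API pattern. The sample data is invented.

```python
import sqlite3

# Stand-in for django.db.connection: an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("CREATE TABLE tablename (column1 TEXT)")
cursor.executemany("INSERT INTO tablename VALUES (?)", [("a",), ("b",), ("a",)])

cursor.execute("SELECT DISTINCT column1 FROM tablename ORDER BY column1")
first = cursor.fetchone()   # a one-element tuple holding the first value
rest = cursor.fetchall()    # the remaining rows, each also a tuple

# Unpack the tuples to get a flat list of the distinct column values.
distinct_values = [first[0]] + [row[0] for row in rest]
print(distinct_values)
```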

If you want to return a queryset from a SQL query, use the raw() method on your model's manager:

# raw() requires the primary key in the SELECT list
qs = ModelName.objects.raw("""SELECT id, first_name
                              FROM myapp_modelname
                              WHERE last_name = 'van Rossum'""")
for person in qs:
    print(person.first_name)  # Result already available
    print(person.last_name)   # Deferred field: has to hit the DB again

Note: raw() was added in Django 1.2; it is not available in earlier releases.

Complete information is available in the documentation under Performing raw SQL queries.

jcdyer
This did it for me. The key point is that the ORM extracts a set of objects ("object" ~= "entire row"), but a result row of SELECT DISTINCT cannot be connected to a single object. So, if you need objects, use the ORM; if you need a result where a row cannot reliably be mapped to a single object, use SQL directly. Thanks.
sienkiew
+3  A: 

Think of it this way.

"For common SQL hack-arounds, what was the object-oriented thing I was supposed to be doing in the first place?"

The issue isn't that the ORM is complex. It's that your brain has been warped in the SQL mold, making it hard to see the objects clearly.

General rules:

  • If you think it's a simple SELECT FROM WHERE, stop. Ask what objects you needed to see in the result set. Then find those objects and work with the object manager.

  • If you think it's a simple JOIN, stop. Ask what primary object you want. Remember, objects don't use foreign keys. Join doesn't mean anything. An object seems to break 1NF and contain an entire set of related objects within it. Then find the "primary" objects and work with the object manager. Use the related objects queries to find related objects.

  • If you think it's an OUTER JOIN, stop. Ask what two things you want to see in the result set. An outer join is things which will join UNIONED with things that won't join. What are the things in the first place? Then find the "primary" objects and work with the object manager. Some will have sets of related objects. Some won't.

  • If you think it's a WHERE EXISTS or WHERE IN with a subquery, your model is probably incomplete. Sometimes, it requires a fancy join. But if you're doing this kind of checking, it usually means you need a property in your model.

  • If you think you need SELECT DISTINCT, you've missed the boat entirely. That's just a Python set. You simply get the column values into a Python set. Those are the distinct values.

  • If you think you need a GROUP BY, you're ignoring Python's collections.defaultdict. Using Python to do a GROUP BY is usually faster than fussing around with SQL.

    Except for data warehousing. Which you shouldn't be doing in Django. You have to use SQLAlchemy for data warehousing.
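
The last two rules can be sketched in plain Python as follows. The rows list is invented sample data; in Django it might come from something like a queryset's values() call, which is an assumption here and not shown.

```python
from collections import defaultdict

# Hypothetical rows, shaped like dictionaries of column values.
rows = [
    {"column1": "blue", "price": 10},
    {"column1": "red", "price": 7},
    {"column1": "blue", "price": 5},
]

# SELECT DISTINCT column1: collect the column values into a set.
distinct_values = {row["column1"] for row in rows}

# SELECT column1, SUM(price) ... GROUP BY column1: accumulate per key.
totals = defaultdict(int)
for row in rows:
    totals[row["column1"]] += row["price"]

print(sorted(distinct_values))
print(dict(totals))
```

Keep in mind that this approach pulls every row into Python first, so it suits result sets that comfortably fit in memory.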

S.Lott
+2  A: 

For your specific how-to, you'd do it like this:

MyModel.objects.values_list('column1', flat=True).distinct()

But other posters are correct to say you shouldn't be thinking 'how do I write this SQL in the ORM'. When you learned Python, coming from Java or C++ or whatever, you soon learned to get out of the mindset of 'how do I write this Java code in Python', and just concentrated on solving the problem using Python. The same should be true of using the ORM.

Daniel Roseman