views:

99

answers:

2

I use the following code to select popular news entries (by date) from the database:

popular = Entry.objects.filter(type='A', is_public=True).extra(select = {'dpub': 'date(dt_published)'}).order_by('-dpub', '-views', '-dt_written', 'headline')[0:5]

To compare the execution speeds of a normal query and this one I ran the following mysql queries:

SELECT *, date(dt_published) as dpub FROM `news_entry` order by dpub DESC LIMIT 500

# Showing rows 0 - 29 (500 total, Query took 0.1386 sec)

-

SELECT * , DATE( dt_published ) AS dpub FROM  `news_entry` ORDER BY id DESC LIMIT 500

# Showing rows 0 - 29 (500 total, Query took 0.0021 sec) [id: 58079 - 57580]

As you can see the normal query is much faster. Is there a way to speed this up?

Is it possible to use mysql views with django?

I realize I could just split the datetime field into two fields (date and time), but I'm curious.


Structure:

CREATE TABLE IF NOT EXISTS `news_entry` (
  `id` int(11) NOT NULL DEFAULT '0',
  `views` int(11) NOT NULL,
  `user_views` int(11) NOT NULL,
  `old_id` int(11) DEFAULT NULL,
  `type` varchar(1) NOT NULL,
  `headline` varchar(256) NOT NULL,
  `subheadline` varchar(256) NOT NULL,
  `slug` varchar(50) NOT NULL,
  `category_id` int(11) DEFAULT NULL,
  `is_public` tinyint(1) NOT NULL,
  `is_featured` tinyint(1) NOT NULL,
  `dt_written` datetime DEFAULT NULL,
  `dt_modified` datetime DEFAULT NULL,
  `dt_published` datetime DEFAULT NULL,
  `author_id` int(11) DEFAULT NULL,
  `author_alt` varchar(256) NOT NULL,
  `email_alt` varchar(256) NOT NULL,
  `tags` varchar(255) NOT NULL,
  `content` longtext NOT NULL
) ENGINE=MyISAM DEFAULT;
+1  A: 
SELECT *, date(dt_published) as dpub FROM `news_entry` order by dpub DESC LIMIT 500

This query orders on dpub, while this one:

SELECT * , DATE( dt_published ) AS dpub FROM  `news_entry` ORDER BY id DESC LIMIT 500

orders on id.

Since id is most probably a PRIMARY KEY for your table, and each PRIMARY KEY has an implicit index backing it, ORDER BY does not need to sort.

dpub is a computed field and MySQL does not support indexes on computed fields. However, ORDER BY dt_published is an ORDER BY dpub as well.

You need to change your query to this:

SELECT *, date(dt_published) as dpub FROM `news_entry` order by date_published DESC LIMIT 500

and create an index on news_entry (dt_published).

Update:

Since DATE is a monotonic function, you may employ this trick:

SELECT  *, DATE(dt_published) AS dpub
FROM    news_entry
WHERE   dt_published >=
        (
        SELECT  md
        FROM    (
                SELECT  DATE(dt_published) AS md
                FROM    news_entry
                ORDER BY
                        dt_published DESC
                LIMIT 499, 1
                ) q
        UNION ALL
        SELECT  DATE(MIN(dt_published))
        FROM    news_entry
        LIMIT 1
        )
ORDER BY
        dpub DESC, views DESC, dt_written DESC, headline
LIMIT 500

This query does the following:

  • Selects the 500th record in dt_published DESC order, or the first record posted should there be less than 500 records in the table.

  • Fetches all records posted later than the date of the last record selected. Since DATE(x) is always less or equal to x, there can be more than 500 records, but still much less than the whole table.

  • Orders and limits these records as appropriate.

You may find this article interesting, since it covers a similar problem:

Quassnoi
Thank you for the datailed answer. However this is not the answer to my question. I know I can't create an index on a computed field. But I'm not very familiar with views so I thought they might help me if I knew more about them. It is not possible to order by dt_published since I also want to order by views/clicks. order_by('-dpub', '-views', '-dt_written', 'headline')
Arnar Yngvason
@Arnar: `MySQL` cannot push predicates into the views: the view will be executed and buffered first, then all other predicates applied. If you need to order by additional fields, you can implement a certain trick (I'll put it to the post now).
Quassnoi
Thanks, I think your update covers it all.
Arnar Yngvason
A: 

May need an index on dt_published. Could you post the query plans for the two queries?

zevra0
I tried that but the problem is that dpub is a computed field and it can't be an index.
Arnar Yngvason
No, not on dpub... create index inx_news_entry_dt on news_entry (dt_published); It may help.
zevra0
added table structure to question, and yes, I did that.
Arnar Yngvason