I have two queries that I'm UNIONing together, and I already know there will be no duplicate rows between the two queries. Therefore, UNION and UNION ALL will produce the same results.
Which one should I use?
You should use the one that matches the intent of what you are looking for. If you want to ensure that there are no duplicates, use UNION; otherwise use UNION ALL. Just because your data will produce the same results right now doesn't mean that it always will.
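To illustrate how the two can drift apart once the data changes, here is a minimal sketch using Python's built-in sqlite3 module. The table names (current_customers, archived_customers) and the sample rows are made up for this example; they are not from the question.

```python
import sqlite3

# In-memory database with two hypothetical tables that start out disjoint.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE current_customers (name TEXT);
    CREATE TABLE archived_customers (name TEXT);
    INSERT INTO current_customers VALUES ('alice'), ('bob');
    INSERT INTO archived_customers VALUES ('carol');
""")

UNION_SQL = (
    "SELECT name FROM current_customers "
    "UNION SELECT name FROM archived_customers"
)
UNION_ALL_SQL = (
    "SELECT name FROM current_customers "
    "UNION ALL SELECT name FROM archived_customers"
)


def names(sql):
    """Run the query and return the names sorted, so results are comparable."""
    return sorted(row[0] for row in conn.execute(sql))


# While the inputs are disjoint, both operators return the same rows.
before_union = names(UNION_SQL)
before_union_all = names(UNION_ALL_SQL)

# Later, the "no duplicates" assumption stops holding.
conn.execute("INSERT INTO archived_customers VALUES ('bob')")

after_union = names(UNION_SQL)          # 'bob' appears once
after_union_all = names(UNION_ALL_SQL)  # 'bob' appears twice

print(before_union == before_union_all)
print(after_union)
print(after_union_all)
```

If a second 'bob' would be a bug for the consumer of the query, UNION states that intent; if duplicates are acceptable or impossible by design, UNION ALL does.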
That said, UNION ALL will be faster on any sane database implementation; see the articles below for examples. Typically, the two are executed the same way except that UNION performs an extra step to remove identical rows (as one might expect), and that step may dominate execution time.
I would use UNION ALL anyway. Even though you know that there are not going to be duplicates, your database server engine might not know that.
So, to give the DB server that extra information and (probably) let its query planner make a better choice, use UNION ALL.
Having said that, if your DB server's query planner is smart enough to infer that information from the UNION clause and table indexes, then the results (both performance-wise and semantically) should be the same.
Either way, it strongly depends on the DB server you are using.
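One way to find out what a particular engine does is to compare the query plans for both forms. The sketch below does this for SQLite through Python's sqlite3 module, only because it is easy to run; the tables t1 and t2 are hypothetical, and other servers have their own EXPLAIN syntax and output.

```python
import sqlite3

# Two hypothetical tables; they can stay empty because we only inspect plans.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER PRIMARY KEY, val TEXT);
    CREATE TABLE t2 (id INTEGER PRIMARY KEY, val TEXT);
""")


def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The detail text is the last column of each plan row.
    return [row[-1] for row in rows]


union_plan = plan("SELECT val FROM t1 UNION SELECT val FROM t2")
union_all_plan = plan("SELECT val FROM t1 UNION ALL SELECT val FROM t2")

print("UNION:")
for step in union_plan:
    print("  ", step)

print("UNION ALL:")
for step in union_all_plan:
    print("  ", step)
```

The wording of the plan steps varies between engines and versions, so no expected output is shown here. What to look for is whether the UNION plan contains an extra deduplication step (a sort, hash, or temporary index) that the UNION ALL plan lacks.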
According to http://blog.sqlauthority.com/2007/03/10/sql-server-union-vs-union-all-which-is-better-for-performance/, at least for performance it is better to use UNION ALL: it does not remove duplicates, and as such it is faster.
I see that you've tagged this question PERFORMANCE, so I assume that's your primary consideration.
UNION ALL will absolutely outperform UNION, since the database doesn't have to check the two sets for duplicates.
Unless you need the database to perform the duplicate checking for you, always use UNION ALL.
Since there will be no duplicates between the two queries, use UNION ALL. You don't need to check for duplicates, and UNION ALL will perform the task more efficiently.