I understand that prior to SQL Server 2005, you could "trick" SQL Server to allow use of an order by in a view definition, by also include TOP 100 PERCENT in the select clause. But I have seen other code which I have inherited which uses SELECT TOP 100 PERCENT ... within dynamic SQL statements (used in ADO in a ASP.NET apps, etc). Is there any reason for this? Isn't the result the same as not including the TOP 100 PERCENT ?
If there is no ORDER BY
clause, then TOP 100 PERCENT
is redundant. (As you mention, this was the 'trick' with views)
[Hopefully the optimizer will optimize this away.]
No reason but indifference, I'd guess.
Such query strings are usually generated by a graphical query tool. The user joins a few tables, adds a filter, a sort order, and tests the results. Since the user may want to save the query as a view, the tool adds a TOP 100 PERCENT. In this case, though, the user copies the SQL into his code, parameterized the WHERE clause, and hides everything in a data access layer. Out of mind, out of sight.
I have seen other code which I have inherited which uses SELECT TOP 100 PERCENT
The reason for this is simple: Enterprise Manager used to try to be helpful and format your code to include this for you. There was no point ever trying to remove it as it didn't really hurt anything and the next time you went to change it EM would insert it again.
...allow use of an ORDER BY in a view definition.
That's not a good idea. A view should never have an ORDER BY defined.
An ORDER BY has an impact on performance - using it a view means that the ORDER BY will turn up in the explain plan. If you have a query where the view is joined to anything in the immediate query, or referenced in an inline view (CTE/subquery factoring) - the ORDER BY is always run prior to the final ORDER BY (assuming it was defined). There's no benefit to ordering rows that aren't the final result set when the query isn't using TOP (or LIMIT for MySQL/Postgres).
Consider:
CREATE VIEW my_view AS
SELECT i.item_id,
i.item_description,
it.item_type_description
FROM ITEMS i
JOIN ITEM_TYPES it ON it.item_type_id = i.item_type_id
ORDER BY i.item_description
...
SELECT t.item_id,
t.item_description,
t.item_type_description
FROM my_view t
ORDER BY t.item_type_description
...is the equivalent to using:
SELECT t.item_id,
t.item_description,
t.item_type_description
FROM (SELECT i.item_id,
i.item_description,
it.item_type_description
FROM ITEMS i
JOIN ITEM_TYPES it ON it.item_type_id = i.item_type_id
ORDER BY i.item_description) t
ORDER BY t.item_type_description
This is bad because:
- The example is ordering the list initially by the item description, and then it's reordered based on the item type description. It's wasted resources in the first sort - running as is does not mean it's running:
ORDER BY item_type_description, item_description
- It's not obvious what the view is ordered by due to encapsulation. This does not mean you should create multiple views with different sort orders...
TOP (100) PERCENT is completely meaningless in recent versions of SQL Server, and it (along with the corresponding ORDER BY, in the case of a view definition or derived table) is ignored by the query processor.
You're correct that once upon a time, it could be used as a trick, but even then it wasn't reliable. Sadly, some of Microsoft's graphical tools put this meaningless clause in.
As for why this might appear in dynamic SQL, I have no idea. You're correct that there's no reason for it, and the result is the same without it (and again, in the case of a view definition or derived table, without both the TOP and ORDER BY clauses).
It was used for "intermediate materialization (Google search)"
Good article: Adam Machanic: Exploring the secrets of intermediate materialization
He even raised an MS Connect so it can be done in a cleaner fashion
My view is "not inherently bad", but don't use it unless 100% sure. The problem is, it works only at the time you do it and probably not later (patch level, schema, index, row counts etc)...
Worked example
This may fail because you don't know in which order things are evaluated
SELECT foo From MyTable WHERE ISNUMERIC (foo) = 1 AND CAST(foo AS int) > 100
And this may also fail because
SELECT foo
FROM
(SELECT foo From MyTable WHERE ISNUMERIC (foo) = 1) bar
WHERE
CAST(foo AS int) > 100
However, this did not in SQL Server 2000. The inner query is evaluated and spooled:
SELECT foo
FROM
(SELECT TOP 100 PERCENT foo From MyTable WHERE ISNUMERIC (foo) = 1 ORDER BY foo) bar
WHERE
CAST(foo AS int) > 100
Note, this still works in SQL Server 2005
SELECT TOP 2000000000 ... ORDER BY...