views:

182

answers:

5

I understand that the "Cannot sort a row of size 9754, which is greater than the allowable maximum of 8094." from SQL server is caused by a row that has over 9k characters in it, which is greater than the page size limit in SQL Server 7. But I'm not calling an order by on the data below, so why does the error say it cannot sort?

SELECT <a number of columns...>
FROM Category10Master c10
JOIN Category20Master c20 ON c10.Cat10ID = c20.ParentCatID 
JOIN Category25Master c25 ON c20.Cat20ID = c25.ParentCatID 
JOIN Category30Master c30 ON c25.Cat25ID = c30 .ParentCatID 
JOIN Item i ON c30.Cat30ID = i.ParentCatID

EDIT: And yes, I know I can call fewer columns to solve this - the actual query calls the columns needed explicitly and still exceeds the row size limit. This is actually in a view that's called site-wide so changing the view to pull back fewer columns isn't an attractive option - dozens of pages would need to be modified to get their data from somewhere other than the view. I'm unlucky enough to have interhited an ugly design and am hoping someone has a more attractive solution than pulling less data.

A: 

Well, the point is: Your row is too big. You are not making an explicit sort, but maybe the query engine decided the sort something for performance purposes?

Either way, you have to do something about your row size, for example not selecting "*", but only the data you really need.

Maximilian Mayerl
+3  A: 

Check the estimated query plan for the sort, to perform one of the joins it may be choosing a merge join for example, and to achieve this it requires the data to be sorted first prior to the merge - at that point you've sorted.

Andrew
Wow, you're right Andrew, the explain plan shows it's doing a sort on a hash table. It doesn't resort to using a hash table until I've added a certain number of tables to the join, so I'm guessing at some point it determines there's enough data to warrant using a hash table, and thus sorting to populate that hash table.
Cory House
My memory of hints back in SQL Server 6 (assuming that was correct in your post) is just not going to be accurate, in 2005 you can override the join mechanisms and usually pay a performance price by writing inner loop join, inner merge join, inner hash join or inner remote join (last one is for linked servers). I've no idea whether the inner loop join would owrk in 6, it might not even be legal syntax. It's a real hack but might move you forward a bit for now.
Andrew
If I could upvote you again I would! The join override does work, though it does have a hit to performance as expected, though I think the production server will be able to handle it fine. You the man!
Cory House
A: 

My wild guess will be that SQL Server needs to sort data somewhere in temporary tables to make joins more performant. It obviously cannot do that since you're selecting each and every column from what, 4 tables.

Consider replacing select * with something more appropriate.

Anton Gogolev
I put * in my example to keep the example short. The actual query is requesting specific necessary columns. I've edited my question for clarity.
Cory House
A: 

You may be able to split up your query with a CTE, derived tables, and or #temp table to get around your problems, but I don't know enough about your tables, returned columns, or indexes to make any specific recommendations.

Without more info it is impossible to solve this for you. What are the table definitions? are you JOINing columns of type INTs or varchar(1000)? do you have covering indexes? are you selecting columns from each table, or only some?

possibly use covering indexes to get all the primary keys in a CTE, derived tables, and or #temp table, and then join back to the regular tables to get the columns you are interested in.

KM
Horridly, unless the OP mistyped, it's SQL Server 6 - which is a real problem in itself. Btw - I posted that table scan with procedure name you wanted.
Andrew
Good catch. I misstyped. It's SQL Server 7.
Cory House
KM - There are indexes on all columns being joined. All are ints. Data is being selected from all tables being joined. Looks like a CTE isn't an option until SQL server 2005, correct?
Cory House
yep, no CTE until 2005
KM
+2  A: 

I know you don't want to hear this but if your view is running up against the actual limits of SQL server, then you need to redesign. This is a problem that won't go away with a some sort of hack. Every time someone wants to do queries that hit the rows that are too big, you are going to encounter the same problem. Now it is possible that you have bad data in the rows that are too big. But I'm betting on just plain bad design. Yes it's a lot of work to refactor a bad database design, but this is only going to get worse through time not better.

Do you get the same error if you directly query the tables instead of using a view? I suspect that you are trying to pivot information from similar tables (based on the table names). Can you do a union all instead and have the application do the pivoting? That way the query at least would have fewer columns.

I know I'm, preaching to the choir here, but it really is time to get your organization to upgrade to a newer version of SQl server. Even now you can't directly upgrade to 2008, so you are in real danger of losing any upgrade path at all if you can't get SQL server 2005. If this is business critical data, you cannot afford to wait on upgrading any longer.

HLGEM
+1 : Entirely right.
Andrew
Yes, it's certainly a bad design. And in the end I actually removed some columns from the view that I found were need in only a few spots to resolve the issue. Regarding the db upgrade, I couldn't agree more. Some clients are simply unwilling to spend a dime on anything but functionality enhancements.
Cory House