views: 42

answers: 3

I have a large SQL Server table with about 40 columns and hundreds of millions of rows.

This table is supposed to have a loose schema, so a lot of the columns are VARCHAR(MAX) even where they could have been BIGINT, DATETIME, INT, etc. Does this have an impact on query time/efficiency? E.g., will

SELECT TOP 100 * FROM MyTable WHERE CustomerId = 34343

be faster than

SELECT TOP 100 * FROM MyTable WHERE CustomerId = '34343'

? If yes, how much faster?

And what if I use VARCHAR(MAX) instead of fixed-length VARCHAR? And what about other DBs like MySQL, etc., in this regard?

+3  A: 

Yes, comparing strings is usually slower than comparing pure numbers. Whether it is measurable depends on how the query execution engine does the comparison. If the query engine does not compare to the end of the strings (which it often won't), the penalty is not great. Try it and see. But in theory, you'd be better off with the numeric comparison for numeric quantities.
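
One way to try it and see is a sketch along these lines (the dbo.Orders table and its two CustomerId columns are hypothetical stand-ins for your own schema); SET STATISTICS TIME reports the CPU and elapsed time for each variant:

-- Measure the numeric vs. string comparison on the same data
SET STATISTICS TIME ON;

-- Numeric comparison against an INT column
SELECT TOP 100 * FROM dbo.Orders WHERE CustomerIdInt = 34343;

-- String comparison against a VARCHAR(MAX) column holding the same values
SELECT TOP 100 * FROM dbo.Orders WHERE CustomerIdVarchar = '34343';

SET STATISTICS TIME OFF;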

Jonathan Leffler
OK... I was hoping for a quantitative report on the effect, if somebody has a source for that. Thanks!
Raze2dust
+2  A: 

Yes, there's definitely a performance benefit from using INT vs. VARCHAR(MAX) for comparisons. How much is really hard to say without actually measuring.

Also, there's nothing wrong with using VARCHAR(MAX) as such, but only use it when it's actually needed and when it makes sense!

See:

for some good reasons why you shouldn't just make everything VARCHAR(MAX) just because you could.
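
To make the point concrete, here is a rough sketch (table and column names are made up for illustration) of the same data stored the loose way and the typed way:

-- Loose schema: every value is a LOB-capable string
CREATE TABLE dbo.CustomerEvents_Loose (
    CustomerId VARCHAR(MAX),
    EventDate  VARCHAR(MAX),
    Amount     VARCHAR(MAX)
);

-- Typed schema: smaller rows, cheaper comparisons, index-friendly
CREATE TABLE dbo.CustomerEvents_Typed (
    CustomerId INT,
    EventDate  DATETIME,
    Amount     DECIMAL(18, 2)
);

With the typed version, the engine compares fixed-size integers and datetimes instead of variable-length strings, and the columns can participate in ordinary indexes.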

marc_s
+1  A: 

And what if I use VARCHAR(MAX) instead of fixed-length VARCHAR? And what about other DBs like MySQL, etc., in this regard?

  • PostgreSQL treats VARCHAR(n) as if it were TEXT CHECK(LENGTH(Column) <= n). There's no performance advantage to specifying a maximum length.
  • SQLite completely ignores length limits on VARCHAR columns.
  • MS SQL Server, however, does not allow VARCHAR(MAX) columns to be used as index key columns, which hurts performance when you filter or join on them (a small sketch follows this list).
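
A minimal SQL Server illustration of that last point (the table names are hypothetical):

CREATE TABLE dbo.Customers_Loose (CustomerId VARCHAR(MAX));

-- Fails: VARCHAR(MAX) is not valid as an index key column
CREATE INDEX IX_Loose_CustomerId ON dbo.Customers_Loose (CustomerId);

CREATE TABLE dbo.Customers_Typed (CustomerId INT);

-- Works: the INT column can be indexed, so lookups can seek instead of scan
CREATE INDEX IX_Typed_CustomerId ON dbo.Customers_Typed (CustomerId);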
dan04