Hi, I have an existing application which uses SQL Server 2005 as a backend. It contains a huge number of records; I need to join tables which contain 50,000-70,000 rows. On the client machine it slows down.

So, can I improve its performance by using XML as a backend? I have checked that the size of the XML file becomes 35 MB to 50 MB when I load the data. I also need to run search operations on the XML file. So, which one is better for performance?

A: 

Which one is better for performance is going to depend on how you use them. Obviously joining on 70K rows is bad for performance. However, you can do things with XML that would be just as bad. Consider how you are going to use the data and make some reasonable optimizations. In SQL Server there is probably some way you can denormalize your data and it will solve your problem, but without a schema and some code nobody can say for sure.
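
For illustration, a minimal sketch of the kind of denormalization meant here. The Customers and Orders tables are invented for the example, not the asker's actual schema:

-- Collapse a frequently joined pair of tables into one flat table,
-- so the hot query path avoids the join entirely.
SELECT c.CustomerId, c.Name, o.OrderId, o.OrderDate, o.Total
INTO dbo.CustomerOrdersFlat   -- materialized, denormalized copy
FROM dbo.Customers AS c
JOIN dbo.Orders AS o ON o.CustomerId = c.CustomerId;

The flat copy trades storage and refresh cost for join-free reads; whether that trade makes sense depends entirely on the real schema and query patterns.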

jarrett
In this case it's not going to matter. If he had 100 records then an XML file would be OK, and probably better than running a full-scale SQL Server on a slow computer. But with a database of this size, XML would clearly be much slower, and a very poor choice.
Øyvind Bråthen
I don't think so; 70K rows is not big for SQL Server. Denormalization is unlikely to help.
MikeAinOz
I've had joins on 70K rows that performed horrendously, but with denormalization the problem could be resolved. I've also had XML files with thousands of nodes that, with some caching, performed totally fine. A 25 MB text file is not that big. We have no idea from the question what he is doing or how the data is structured.
jarrett
+4  A: 

SQL Server is definitely the better option for that kind of size.

If your SQL database is operating very slowly, this indicates that you have not created the necessary indexes. You need to create indexes so that the SQL engine can much more easily and efficiently look up rows by specific values. For example, if you have a table of customers, and you often want to find customers by their name, then you should index the Name column. Since you say you join large tables, you should index the columns on which you join.
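
For example, a minimal sketch (the table and column names are illustrative, borrowed from the example above, not the asker's actual schema):

-- Speeds up lookups such as: SELECT * FROM Customers WHERE Name = 'Smith'
CREATE INDEX IX_Customers_Name ON dbo.Customers (Name);

-- Index the join keys on both sides of a frequently used join.
CREATE INDEX IX_Orders_CustomerId ON dbo.Orders (CustomerId);

With the join keys indexed, SQL Server can typically seek into the index instead of scanning both tables.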

50,000 records is not actually that many. The last time I maintained a database running on SQL Server, it had tables with millions of rows (and reasonably so; the data was not crazy redundant or anything) and joining against those tables in very complex queries was absolutely fine.

Timwi
Yes, I have done the same, applying indexes to the appropriate columns. But my problem is that I need to run the application on limited machines with low-end hardware.
Naresh
Then how is that going to get better with a slower approach? These days many shops skimp on hardware. Get appropriate hardware to run your database; I regularly join tables with 650 MILLION rows without problems. You cannot enter Formula 1 with a street car and win. That simple. Crap hardware = hardware upgrade.
TomTom
+6  A: 

Can I improve its performance by using XML as a backend?

Most definitely not! XML is slow; use SQL Server instead.

XML was originally designed with interoperability in mind, not performance. As a result, the verbose and dynamic nature of the XML format makes it relatively expensive to read, and nigh on impossible to index. There is absolutely no reason to believe that using XML is ever the solution to performance issues, in any situation.

Don't get me wrong: I think XML is frankly awesome, but it's definitely not quick. SQL Server, on the other hand, is designed from the ground up with performance and scalability in mind.

I wrote another answer to a similar question which you might find relevant.

Kragen
A: 

Joel Spolsky addresses this in Back to Basics.

Last week I wrote that you can't implement the SQL statement SELECT author FROM books fast when your data is stored in XML. Just in case everybody didn't understand what I was talking about, and now that we've been rolling around in the CPU all day, this assertion might make more sense.

How does a relational database implement SELECT author FROM books? In a relational database, every row in a table (e.g. the books table) is exactly the same length in bytes, and every field is always at a fixed offset from the beginning of the row. So, for example, if each record in the books table is 100 bytes long, and the author field is at offset 23, then there are authors stored at byte 23, 123, 223, 323, etc. What is the code to move to the next record in the result of this query? Basically, it's this:

pointer += 100;

One CPU instruction. Faaaaaaaaaast.

Now let's look at the books table in XML.

<?xml blah blah>
<books>
     <book>
          <title>UI Design for Programmers</title>
          <author>Joel Spolsky</author>
     </book>
     <book>
          <title>The Chop Suey Club</title>
          <author>Bruce Weber</author>
     </book>
</books>

Quick question. What is the code to move to the next record?

Uh...
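
(Not from Joel's article, but to make the contrast concrete: even when SQL Server itself shreds XML with its built-in xml type, it has to parse and walk every node; there is no fixed-offset jump. A rough sketch:)

-- Illustrative only: shredding the books document above with the xml type.
DECLARE @books xml;
SET @books = N'<books>
  <book><title>UI Design for Programmers</title><author>Joel Spolsky</author></book>
  <book><title>The Chop Suey Club</title><author>Bruce Weber</author></book>
</books>';

-- "Moving to the next record" means iterating the parsed node tree,
-- not adding 100 to a pointer.
SELECT b.value('(author/text())[1]', 'nvarchar(200)') AS author
FROM @books.nodes('/books/book') AS t(b);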

dan04
Ah, no. In a SQL database every row is NOT exactly the same size, and hasn't been for some time. Databases optimize more than that; at least proper ones like SQL Server have for the last twenty years or so. Read up in the documentation on how many bytes fields (like varchar etc.) use. Hint: it is not fixed; it depends on the size of the string stored. Learn how things work before talking.
TomTom
@TomTom: You could have easily said the same thing, but without the condescending tone.
Timwi
+2  A: 

Don't use XML as a database; it's dangerous!

When multiple processes access the XML file at once to update it or append new elements, you will lose data!

And if the power goes off while the XML file is being written, data will be lost and the file can end up garbled.

Roy
+1  A: 

XML is not a database. XML is a data interchange format, and using it as a database is generally an abuse.

XML files can't be easily updated and aren't (generally) indexed, so you can't do anything useful with them, database-wise (these are just two reasons why you should not abuse XML as a database).

So by all means use XML for small configuration files etc. (which fit trivially in memory and can be manipulated by load-modify-save), but don't use it as a database even for small amounts of data.

MarkR