I'm developing software based on Entity Framework to handle data in a MS SQL Server 2008 database.

[Problem 1]

I've just tried to insert a small amount of data (about 2 MB) from my program into the database: the performance is very bad! It takes more than 1 minute to insert this data!

I've tried generating pre-compiled views, but I get the same results :-(

All my code uses a business layer (automatically generated from the .edmx file with a T4 template) to manage data in a service layer. It is very practical for navigating the relations between objects.

How can I improve the performance of these inserts with Entity Framework?

[Problem 2]

Also, before inserting data into the database with the SaveChanges() method, I fill my object context with the AddObject() method. I add about 100,000 small objects (about 2 MB) to my object context with AddObject(): it takes a very long time (more than 10 minutes)!
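For illustration, here is a minimal sketch of the pattern I'm describing (the context and entity names are just placeholders):

```csharp
using (var context = new MyEntities()) // placeholder for the ObjectContext generated from the .edmx
{
    // [Problem 2]: this loop takes more than 10 minutes.
    for (int i = 0; i < 100000; i++)
    {
        context.Measurements.AddObject(new Measurement { Value = i }); // placeholder entity
    }

    // [Problem 1]: this single call takes more than 1 minute.
    context.SaveChanges();
}
```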

How can I decrease this time?

UPDATE

My program must save more than 50 MB to the database in less than 2-3 minutes. Do you think that will be possible with EF?

+1  A: 

You can profile the queries using SQL Server Profiler and then use the Database Engine Tuning Advisor to determine better statistics and indexes to add.

John Hartsock
AFAIK, adding indexes decreases insert performance
Andomar
@Andomar, you don't know what the problem is. If it's a slow query in an `AFTER INSERT` trigger, adding an index might indeed make the insert faster. "Profile it and find the problem" is the only correct answer here. There's not enough information in Patrice's question to say anything more.
Craig Stuntz
A: 

Chances are that it is not Entity Framework but your database schema that is the culprit, e.g. bad clustered indices or too many indices.

You can see the SQL generated by Entity Framework by using the ObjectQuery.ToTraceString() method.
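For example (a minimal sketch; the context, entity set, and property names are hypothetical):

```csharp
using System;
using System.Data.Objects;
using System.Linq;

using (var context = new MyEntities()) // hypothetical ObjectContext
{
    // LINQ to Entities queries are ObjectQuery<T> instances at runtime, so this cast is safe.
    var query = (ObjectQuery<Measurement>)context.Measurements.Where(m => m.Value > 0);

    Console.WriteLine(query.ToTraceString()); // prints the SELECT statement EF will send
}
```

Note that ToTraceString() only applies to queries; to see the INSERT statements that SaveChanges produces, use SQL Server Profiler.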

If the SQL generated by Entity Framework isn't satisfactory for your needs, you can map your own stored procedures for the create, update and delete operations.

Martin Liversage
In fact, the process of adding new objects to the context with the .AddObject() method seems to be very slow as well :-(
Patrice Pezillier
IMHO, even the most horrible schema design could not explain a couple of inserts taking a minute to complete
Andomar
A: 

I am experiencing a similar issue with my current project. I attached to SQL Profiler and kicked off my process. It appears that the Entity Framework kicks off an SQL transaction for every INSERT or UPDATE each time changes are saved.

I'm not sure if that has anything to do with the issue, but I would run SQL Profiler and examine the results. I would also monitor your disk usage (reads and writes) on your SQL box (if possible) and see if there are any red flags.

James Smith
Inserts are always part of a transaction. If Entity Framework didn't start a transaction for each insert, SQL Server would start one automatically and commit it immediately after the insert. That's called autocommit mode.
Andomar
Ah, okay. Thanks for the information. So if there are a large number of INSERT or UPDATE operations taking place, could this have anything to do with the performance? Do SQL transactions add significant overhead?
James Smith
@James Smith: SQL transactions are required, so technically they don't add overhead, they're just part of the insert. For big imports, check out bulk inserts.
Andomar
I don't know much about EF, but shouldn't you explicitly start a transaction before a large insert or update operation? That would save SQL Server from creating a new transaction for every row and should considerably speed up the operation (see the sketch after this thread).
Pradeep
@Pradeep - That's what I was driving at. If there is a large data set (let's just say >= 1000 records), isn't there a performance hit if EVERY SINGLE row runs in its own transaction?
James Smith
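For reference, a minimal sketch of the explicit-transaction idea from this thread, using TransactionScope (the context, entity set, and collection names are hypothetical). Note that SaveChanges already wraps all of its commands in a single transaction, so an explicit scope mainly matters when you want several SaveChanges calls to commit together:

```csharp
using System.Transactions; // requires a reference to System.Transactions.dll

using (var scope = new TransactionScope())
using (var context = new MyEntities()) // hypothetical ObjectContext
{
    foreach (var m in measurements) // hypothetical collection of entities
        context.Measurements.AddObject(m);

    context.SaveChanges(); // enlists in the ambient transaction
    scope.Complete();      // commit; omitting this rolls everything back
}
```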
+3  A: 

You could use the Entity Framework Profiler to check what SQL is being generated. This tool has a 30-day free trial. It also claims that it can do "Analysis and detection of common pitfalls when using Entity Framework".

There is also this article on EF tuning.

Edit

Based on your edits, EF is not the way to go. You need to do a bulk insert in order to get the data in that fast. Have a look at this link, where I helped someone reduce load time from 10 hours to 6 minutes.
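For reference, a minimal SqlBulkCopy sketch (the connection string, table, and column names are hypothetical):

```csharp
using System.Data;
using System.Data.SqlClient;

// Build a DataTable whose columns match the destination table.
var table = new DataTable();
table.Columns.Add("Id", typeof(int));
table.Columns.Add("Value", typeof(double));

for (int i = 0; i < 100000; i++)
    table.Rows.Add(i, i * 0.5);

using (var bulk = new SqlBulkCopy(@"Data Source=.;Initial Catalog=MyDb;Integrated Security=True"))
{
    bulk.DestinationTableName = "dbo.Measurements"; // hypothetical table
    bulk.BatchSize = 5000;                          // rows per batch; tune as needed
    bulk.WriteToServer(table);                      // streams all rows in bulk
}
```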

Shiraz Bhaiji
@Shiraz: Can I mix bulk insert with EF?
Patrice Pezillier
@Patrice, it does not look like that is possible, see: http://stackoverflow.com/questions/1609153/how-to-do-a-bulk-insert-linq-to-entities
Shiraz Bhaiji
@Shiraz, in EF 4 you can execute "native" SQL Server statements via [ObjectContext.ExecuteStoreCommand](http://msdn.microsoft.com/en-us/library/system.data.objects.objectcontext.executestorecommand.aspx). That is probably the best way to do bulk loading (see the sketch after this comment thread). But I still wonder about the source. If it were, e.g., CSV files, an SSIS package might be the best fit.
Craig Stuntz
@Shiraz, @Craig: I need to do a bulk *insert*, loading from memory into the database. Can I use ObjectContext.ExecuteStoreCommand directly for bulk loading? What about bulk insert?
Patrice Pezillier
Patrice, it seems to me that your issues are bigger than one performance point. You need an architectural review; I think that's more than can be done in one SO question. One place to start might be this article: http://cacm.acm.org/magazines/2010/6/92486-managing-scientific-data/fulltext. Although it deals with somewhat larger data scales, it discusses the issues involved.
Craig Stuntz
@Craig - That is my guess, too (architectural issues). I am wondering how the database is structured. I have used EF for large amounts of data and have never had this type of problem.
JasCav
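To illustrate the ExecuteStoreCommand suggestion from this thread, a minimal sketch (EF 4; the context, table, and values are hypothetical):

```csharp
using (var context = new MyEntities()) // hypothetical ObjectContext
{
    // Sends a native T-SQL statement over the store connection;
    // {0} and {1} become command parameters.
    int affected = context.ExecuteStoreCommand(
        "INSERT INTO dbo.Measurements (Id, Value) VALUES ({0}, {1})", // hypothetical table
        42, 3.14);
}
```

Each call issues a single statement, so for 100,000 rows a set-based approach (SqlBulkCopy or a table-valued parameter) will generally still be much faster.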
+1  A: 

There are several possibilities here.

  1. The database hardware might not be up to the task of handling 100,000 inserts. How many tables are involved? Are there ancillary considerations such as triggers that are firing? Is the database memory constrained?

  2. The Web server hardware might not be up to the task of processing that much load. Where is the data originating? How long is it taking to transfer to the web server? How many inserts/sec is the web server actually sending to the database server?

To sum up, you have to profile to figure out exactly where the bottlenecks are. With the information you've given so far it could be anywhere.

You need to run a profiler on the web server, and you need to use SQL Profiler on the database server. Additionally, you should be monitoring both machines' CPU, memory, and network usage while loading the data.

Once you have all of that data you should be able to pinpoint where the problem is.

UPDATE
As a side note, EF has to create at least 100,000 objects to hold the data that you are uploading (one for each record). This has its own overhead, which is why ORMs are usually not a good idea for large-scale inserts and updates.

UPDATE 2
If you are running both the client and the database server on the exact same desktop machine, then you are likely to have hardware issues. Hopefully you have a multi-core system with at least 4 GB of RAM. Everything on your system is going to be competing for resources: Visual Studio, SQL Server, the app itself, plus whatever else you happen to be running. Visual Studio and SQL Server are memory and CPU hogs. (FYI: SQL Server doesn't report everything to Task Manager.)

The point is, unless you are deploying the app with an embedded SQL Server, this isn't going to come close to being a real-world test, and the performance (or lack thereof) you are seeing has no relationship to a real-world scenario.

Chris Lively
1. I'm talking about 2 MB of data in only 1 table (20 bytes per row). There are no triggers. How can I see if the database memory is constrained? Could this explain the very slow AddObject()? 2. SQL Server is on localhost. 3. My program is not a web application; it is a thick client in C#.
Patrice Pezillier
@Patrice: pull up Task Manager and go to the Performance tab. When executing the upload, what happens? Is there memory available, or is your disk thrashing? Also, if you are running this in debug mode, don't. Run it in release mode.
Chris Lively
memory is available
Patrice Pezillier
@Patrice: Then pull out the SQL Profiler and watch the transactions coming across to figure out how it is impacting the rest of the system.
Chris Lively
@Chris: OK for the insertion, but how can I reduce the AddObject time?
Patrice Pezillier
@Patrice: Optimizing EF is a little outside my capabilities. I don't use it because of things like this. My approach to this problem would be either to use a stored procedure that accepts a table-valued parameter containing all of the data I wanted to insert, or to use some type of bulk-loading capability (a sketch follows this thread).
Chris Lively
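To make the table-valued parameter approach from the last comment concrete, here is a minimal sketch (the table type, stored procedure, and column names are hypothetical; TVPs require SQL Server 2008 or later):

```csharp
using System.Data;
using System.Data.SqlClient;

// Assumes these hypothetical objects exist in the database:
//   CREATE TYPE dbo.MeasurementType AS TABLE (Id INT, Value FLOAT);
//   CREATE PROCEDURE dbo.InsertMeasurements @rows dbo.MeasurementType READONLY AS
//     INSERT INTO dbo.Measurements (Id, Value) SELECT Id, Value FROM @rows;

var table = new DataTable();
table.Columns.Add("Id", typeof(int));
table.Columns.Add("Value", typeof(double));
for (int i = 0; i < 100000; i++)
    table.Rows.Add(i, i * 0.5);

using (var connection = new SqlConnection("...")) // connection string elided
using (var command = new SqlCommand("dbo.InsertMeasurements", connection))
{
    command.CommandType = CommandType.StoredProcedure;

    var parameter = command.Parameters.AddWithValue("@rows", table);
    parameter.SqlDbType = SqlDbType.Structured; // marks it as a TVP
    parameter.TypeName = "dbo.MeasurementType"; // the user-defined table type

    connection.Open();
    command.ExecuteNonQuery(); // one round trip for the whole batch
}
```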
+1  A: 

In the original question, I get the impression that 100,000 or so objects are all created at once and a single call to SaveChanges is made. This kind of large unit of work with a single call to SaveChanges is almost always a very bad idea for Entity Framework performance. I would try dividing things up into a series of batches of maybe 100 objects or so and calling SaveChanges between batches. In fact, I would even dispose of the context and create a new one between batches, because otherwise you end up with all of those objects in the state manager at once, which imposes overhead. You will probably need to experiment with different batch sizes to find the best number.
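A minimal sketch of that batching pattern (the batch size, context, and entity names are hypothetical):

```csharp
using System.Linq;

const int batchSize = 100; // experiment to find the sweet spot

for (int start = 0; start < items.Count; start += batchSize) // items: hypothetical List<Measurement>
{
    // A fresh context per batch keeps the state manager small.
    using (var context = new MyEntities()) // hypothetical ObjectContext
    {
        foreach (var item in items.Skip(start).Take(batchSize))
            context.Measurements.AddObject(item);

        context.SaveChanges(); // persists just this batch
    }
}
```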

Also, you might want to consider what type of entities you are using. If you use POCO entities, there is some change-detection overhead that isn't present when you use classes that implement IEntityWithChangeTracking (or inherit from EntityObject, which implements that interface).

  • Danny
Danny Simmons