views:

74

answers:

2

Succinct

How do I performance tune my Linq To SQL DAL methods? Specifically the amount of data being transferred.

Verbose

I have a WinForms app that uses Linq To Sql to access its data. We have 5 branches, 1 physically at the same location as the SQL Server and the other 4 at various distances and bandwidths. My app is your typical enterprise LOB CRUD app. You search for a person, select them, and then as you select a control, say Demographics, it loads that person's demographics and displays them on screen.

I only store the "key stone" IDs and then make DB calls using those primary keys. This works great at the main location, with the SQL Server on site. However, the branches are experiencing severe delays loading some of the controls.

This is a fairly typical example (loading the Person Details user control) from my DAL:

    public static DataTable GetGeneralPersonInfo(int personID)
    {
        using (var context = ConnectDataContext.Create())
        {
            var generalPersonInfo = from person in context.tblPersons
                                    where person.PersonID == personID
                                    join address in context.tblAddresses.Where(a => a.AddressTypeID == 'm')
                                        on person.PersonID equals address.PersonID
                                    select
                                        new
                                            {
                                                person.PersonID,
                                                person.FirstName,
                                                person.MiddleName,
                                                person.LastName,
                                                person.SuffixID,
                                                person.TitleID,
                                                address.AddressLine1,
                                                address.AddressLine2,
                                                address.AddressLine3,
                                                address.CityName,
                                                address.StateID,
                                                address.ZipCode,
                                                address.ZipPlus,
                                                address.DirectionsToAddress,
                                                address.CountyID,
                                                person.ResidencyCountyID,
                                                person.ResponsibilityCountyID,
                                                person.EmailAddress,
                                                person.SSN,
                                                person.Gender,
                                                person.BirthDate,
                                                person.DeathDate,
                                                person.DriversLicenseNumber,
                                                person.DriversLicenseStateID,
                                                person.HispanicOriginFlag,
                                                person.CitizenFlag,
                                                person.VeteranFlag,
                                                person.MaritalStatusID,
                                                person.PrimaryRaceID,
                                                person.SecondaryRaceID,
                                                person.UpdateUserID,
                                                person.UpdateDateTime,
                                                person.InsertDateTime,
                                                person.InsertUserID,
                                            };

            return generalPersonInfo.CopyLinqToDataTable();
        }
    }

Anything glaringly wrong there?
To provide further disclosure, every table has a uniqueidentifier column that is NOT the PK. Also, I have very wide tables (60+ columns in some places) with large text fields (varchar(500) up to varchar(max)).

+2  A: 

I'm going to guess this has something to do with your branches' connections to the main location where the DB is. If you are getting decent performance at the main branch, but the offsite branch locations are all acting slowly, it probably has nothing to do with your DB query.

NickLarsen
Thanks, so you don't see anything glaringly wrong with the above? Our network team insists we have tons of bandwidth to the branches.
Refracted Paladin
As long as it is running fine at the main branch, the problem is somewhere in between. Queries don't run slower just because they recognize they are originating from an outside source. And your query is so basic that I doubt there is much optimization to be done.
NickLarsen
+3  A: 

I only store the "key stone" ID's and then make DB calls using those Primary Keys

This is your problem.

Assume you load entities for 100 keys.

  • Local branch, 1ms latency. That is 100ms just for the network. 0.1 seconds - performance is "ok".

  • Remote branch, let's say 60ms latency. That is 6000ms just for the network. 6 seconds. And it only gets worse from there.

An old rule, taught to me 15 years ago: DO NOT WRITE CHATTY INTERFACES ACROSS TIER BOUNDARIES. Make as few calls as possible. You made an interface with as many calls as possible.

The rest is caching (large fields - how often do they change?).
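As an illustration of the un-chatty version (a sketch only - the table and member names are taken from the posted DAL method, and `CopyLinqToDataTable` is the question's own extension), the per-key calls can be collapsed into one round trip with `Contains`, which LINQ to SQL translates into a SQL `IN` clause:

    // Sketch, assuming the question's ConnectDataContext and schema.
    // One query for N people instead of N queries for one person each.
    public static DataTable GetGeneralPersonInfo(IEnumerable<int> personIDs)
    {
        var ids = personIDs.ToList();
        using (var context = ConnectDataContext.Create())
        {
            var generalPersonInfo =
                from person in context.tblPersons
                where ids.Contains(person.PersonID) // WHERE PersonID IN (...)
                join address in context.tblAddresses.Where(a => a.AddressTypeID == 'm')
                    on person.PersonID equals address.PersonID
                select new
                {
                    person.PersonID,
                    person.FirstName,
                    person.LastName,
                    address.AddressLine1,
                    address.CityName
                    // ...project only the columns the controls actually bind to
                };

            return generalPersonInfo.CopyLinqToDataTable();
        }
    }

The explicit projection also helps with the wide-table concern: by selecting named columns instead of whole rows, the varchar(max) fields never cross the wire unless a control needs them.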

TomTom
Are you saying I should pull down whole objects and store them? I think I am missing something here. You are saying that my 'real-time' delay is coming from the inherent network latency of opening and closing hundreds of connections, not from the amount of data pulled across the pipes?
Refracted Paladin
No, from the latency of transmitting SQL commands and waiting for the answer. Every time you ask for an object by ID (rather than a list of objects by a list of IDs), the query travels to the server and the answer travels back. This delay really adds up FAST on a remote connection. Rewrite your usage to make as few calls to the database as possible. Start using a server-side DAL if needed.
TomTom
My app is an Occasionally Disconnected App, therefore no server side DAL :(.
Refracted Paladin
I am going to try and restate what I believe you are saying, for my own clarity: Right now I store PersonID, AddressID, and PhoneID when you select a Person. Then when you go to a certain control, it uses that stored ID (the PK in the DB) to call the DB and retrieve the necessary information. You are saying that is wrong, and that I should pull the entire Person object (tblPerson) the first time?
Refracted Paladin
It depends. What I am saying is that when you make another 3 calls through a slow network, the fact that you made 3 calls instead of 1 takes time. I once designed an ORM (sold and proprietary for a long time) and its API explicitly allowed me to bundle queries and submit them to the server-side DAL in one round trip. Round trips introduce network latency. My office is connected to my hosting cluster with a latency of 100ms. That is SIGNIFICANT if it happens more often than needed. A lot more than transmitting 10kb of extra data in one run.
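LINQ to SQL has no general query-bundling API, but for parent/child loads `DataLoadOptions.LoadWith` (from `System.Data.Linq`) pulls related rows in the same round trip as the parent. A sketch, assuming a Person-to-Addresses association is defined in the .dbml:

    using (var context = ConnectDataContext.Create())
    {
        // Eager-load each person's addresses along with the person row,
        // so selecting a person costs one trip instead of two.
        // LoadOptions must be set before any query runs on this context.
        var options = new DataLoadOptions();
        options.LoadWith<tblPerson>(p => p.tblAddresses);
        context.LoadOptions = options;

        var person = context.tblPersons.Single(p => p.PersonID == personID);
        var mainAddress = person.tblAddresses
                                .FirstOrDefault(a => a.AddressTypeID == 'm');
    }

Note that `LoadWith` fetches entire rows, so on these wide tables it trades the extra round trip for extra bytes; whether that wins depends on how large the text columns are.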
TomTom
I think I understand. Not sure what I can/will do about it, but I think I finally understand. Maybe this is where some POCOs would come in handy....
Refracted Paladin
That's a great pickup on the CHATTY DAL. Since the code for the list isn't posted, I figured it pulled the whole list at once; then when a `Person` is selected, it essentially loads a list of foreign key identifiers, and then each item selected on the form makes a single call (the code that is listed) to load that entity. That doesn't seem all that CHATTY to me. @Refracted Paladin, how many calls does your database make for the list?
NickLarsen
He said he loads a list of primary keys. Then, on demand, he loads the objects by PK. Sounds like a perfectly chatty DAL to me.
TomTom
@NickLarsen: Example flow --> 1 call to get search results, 1 call to get the PKs (PersonID, AddressID, PhoneID, etc.), then when loading a specific UserControl it uses the appropriate PK from the second step to pull down a "row" of data to populate the control. So from search to population of the detail control: 3 DB trips, each wrapped in `using`.
Refracted Paladin
Ok, now - what is the ping time between the outer locations and the central database server? This can be significant - and it can make things SLOW ;)
TomTom
Aye, good answer @TomTom.
NickLarsen
the ping averages 40ms with 10% loss --> `ping -f -l 1024 wwcsql -t`
Refracted Paladin
Are you joking? Ok, 40ms is great. 10% packet loss, though, will KILL EVERYTHING. Packet loss should be LOW. 2% is high; 10% is a disaster. Any TCP-based setup will be dead slow, with every 10th packet needing recovery (60-second timeouts).
TomTom