views:

65

answers:

3

this is my code

void fixInstellingenTabel(object source, ElapsedEventArgs e)
{
    NASDataContext _db = new NASDataContext();

    List<Instellingen> newOnes = new List<Instellingen>();

    List<InstellingGegeven> li = _db.InstellingGegevens.ToList();
    foreach (InstellingGegeven i in li) {
        if (_db.Instellingens.Count(q => q.INST_LOC_REF == i.INST_LOC_REF && q.INST_LOCNR == i.INST_LOCNR && q.INST_REF == i.INST_REF && q.INST_TYPE == i.INST_TYPE) <= 0) {
            // There is no item yet. Create one.
            Instellingen newInst = new Instellingen();
            newInst.INST_LOC_REF = i.INST_LOC_REF;
            newInst.INST_LOCNR = i.INST_LOCNR;
            newInst.INST_REF = i.INST_REF;
            newInst.INST_TYPE = i.INST_TYPE;
            newInst.Opt_KalStandaard = false;
            newOnes.Add(newInst);
        }
    }
    _db.Instellingens.InsertAllOnSubmit(newOnes);
    _db.SubmitChanges();
}

basically, the InstellingGegevens table gest filled in by some procedure from another server. the thing i then need to do is check if there are new records in this table, and fill in the new ones in Instellingens.

this code runs for like 4 minutes on 15k records. how do I optimize it? or is the only way a Stored Procedure?

this code runs in a timer, running every 6h. IF a stored procedure is best, how to I use that in a timer?

        Timer Tim = new Timer(21600000); //6u
        Tim.Elapsed += new ElapsedEventHandler(fixInstellingenTabel);
        Tim.Start();
+2  A: 

One way you could optimise this is by changing the Count(...) <= 0 into Any(). However, an even better optimisation would be to retrieve this information in a single query outside the loop:

var instellingens = _db.Instellingens
    .Select(q => new { q.INST_LOC_REF, q.INST_LOCNR, q.INST_REF, q.INST_TYPE })
    .Distinct()
    .ToDictionary(q => q, q => true);

(On second thought, a HashSet would be most appropriate here, but there is unfortunately no ToHashSet() extension method. You can write one of your own if you like!)

And then inside your loop:

if (instellingens.ContainsKey(new { q.INST_LOC_REF, q.INST_LOCNR,
                                    q.INST_REF, q.INST_TYPE })) {
    // There is no item yet. Create one.
    // ...
}

Then you can optimise the loop itself by making it lazy-retrieve:

// No need for the List<InstellingGegeven>
foreach (InstellingGegeven i in _db.InstellingGegevens) {
    // ...
}
Timwi
+1. Netagitve side, obviously, is that memory consumption goes up. Not a real negaitve unless this gets critical - so it does not stop my +1.
TomTom
+3  A: 

Doing this in a stored procedure would be a lot faster. We do something quite similar, only there is about 100k items in the table, it's updated every five minutes, and has a lot more fields in it. Our job takes about two minutes to run, and then it does updates in several tables across three databases, so your job would reasonably take only a couple of seconds.

The query you need would just be something like:

create procedure UpdateInstellingens as

insert into Instellingens (
  INST_LOC_REF, INST_LOCNR, INST_REF, INST_TYPE, Opt_KalStandaard
)
select q.INST_LOC_REF, q.INST_LOCNR, q.INST_REF, q.INST_TYPE, cast(0 as bit)
from InstellingGeven q
left join Instellingens i
  on q.INST_LOC_REF = i.INST_LOC_REF and q.INST_LOCNR = i.INST_LOCNR
  and q.INST_REF = i.INST_REF and q.INST_TYPE = i.INST_TYPE
where i.INST_LOC_REF is null

You can run the procedure from a job in the SQL server, without involving any application at all, or you can use ADO.NET to execute the procedure from your timer.

Guffa
this code is brilliant. it runs for 15k records in less than a second. i think about half a second. stunning speed :) Can you explain me how exactly this query works? cos i dont understand it quite right. is it like this?: Insert into the table,Select the 4 items from instellinGegevens, where the inst_loc_ref is null. i dont get what the left join does. why doesnt it give an error when you join those 2 tables when 1 of them doesnt have a corresponding value.
Stefanvds
edit: ok i get the left join. brilliant thinking:) thanks! http://www.w3schools.com/sql/sql_join_left.asp
Stefanvds
+1  A: 

What Guffa said, but using Linq here is not the best course if performance is what you are after. Linq, like every other ORM, sacrifices performance for usability. Which is usually a great tradeoff for typical application execution paths. On the other hand, SQL is very, very good at set based operations so that really is the way to fly here.

Wyatt Barnett