So here is my situation: I have a vendor-supplied DB that we cannot modify, and a custom db that imports data from the vendor app and acts on it. Once records have been imported from the vendor app, they must no longer appear in the list of records available for import. Also, we only want to display the 250 most recent records that have not been imported.

My original approach was to select the list of ids that have already been imported from the custom db, and then query the vendor db, using that list in a .Where(x => !idList.Contains(x.Id)) clause on the remote query.

This worked until the custom db passed 2100 imported records, since 2100 is the maximum number of parameters SQL Server accepts in a single query. After working out that this was the actual problem, and not the 'invalid buffer'/'severe error' that ADO.NET reported, my solution was to exclude the first 2000 ids in the remote query, and then exclude the remaining ids in a local (in-memory) query.
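For anyone trying to picture that workaround, here is a rough sketch of how the split exclusion could look. The VendorRecord class, its Created column, and the SplitExclusion helper are all made-up names for illustration (the post does not show the real schema), and an IQueryable stands in for the vendor table so the same shape works against LINQ to SQL or an in-memory list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of a vendor row; the real table and columns are not shown in the post.
public class VendorRecord
{
    public int Id { get; set; }
    public DateTime Created { get; set; }
}

public static class SplitExclusion
{
    // Stay safely below SQL Server's 2100-parameter limit.
    private const int MaxRemoteIds = 2000;

    public static List<VendorRecord> GetUnimported(
        IQueryable<VendorRecord> vendorRecords,
        IList<int> importedIds,
        int pageSize)
    {
        // Part 1: ids excluded inside the remote query.
        List<int> remoteIds = importedIds.Take(MaxRemoteIds).ToList();
        // Part 2: ids that have to be excluded in memory afterwards.
        var localIds = new HashSet<int>(importedIds.Skip(MaxRemoteIds));

        // Over-fetch: at most localIds.Count of the fetched rows get discarded locally,
        // so this many rows is enough to still end up with a full page.
        int fetchCount = pageSize + localIds.Count;

        return vendorRecords
            .Where(r => !remoteIds.Contains(r.Id))
            .OrderByDescending(r => r.Created)
            .Take(fetchCount)
            .AsEnumerable()                       // everything below runs in memory
            .Where(r => !localIds.Contains(r.Id))
            .Take(pageSize)
            .ToList();
    }
}
```

The over-fetch is what makes this feel wasteful: the number of rows pulled back just to be thrown away grows with every import beyond the first 2000.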

Having to pull back a large number of irrelevant records just to exclude them, so that I can get the correct 250 records, seems very inelegant. Is there a better way to do this, short of a cross-db stored procedure?

Thanks in advance.

A: 

This might not be the best answer, depending on how many records you're dealing with, but you could force the SQL to execute and just deal with the results as in-memory objects. Calling the ToList() method executes the SQL and returns the results as an in-memory List<T>, which you can then filter with ordinary LINQ to Objects.
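A minimal sketch of that idea, assuming a made-up VendorRecord class with Id and Created properties (the real schema isn't shown in the question) and an IQueryable standing in for the vendor table:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of a vendor row, for illustration only.
public class VendorRecord
{
    public int Id { get; set; }
    public DateTime Created { get; set; }
}

public static class InMemoryExclusion
{
    public static List<VendorRecord> GetUnimported(
        IQueryable<VendorRecord> vendorRecords,
        IEnumerable<int> importedIds,
        int pageSize)
    {
        // A HashSet keeps each "already imported?" check cheap.
        var imported = new HashSet<int>(importedIds);

        // ToList() runs the query now and pulls every vendor row into memory,
        // so no id list is ever sent to SQL Server as parameters.
        List<VendorRecord> all = vendorRecords.ToList();

        return all
            .Where(r => !imported.Contains(r.Id))
            .OrderByDescending(r => r.Created)
            .Take(pageSize)
            .ToList();
    }
}
```

The trade-off is that the whole vendor table crosses the wire on every call, so this is only reasonable while that table stays fairly small.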

Robaticus
Right, that is probably the easiest way, but it seems wasteful. My case is 20k records right now, and that number grows daily. It is probably not a big deal until you hit a few hundred thousand records, but I can't be sure, as the vendor db is constantly in use by the vendor application and there is no good way to simulate daily usage.
tap
You could also import the keys into a table in your database and do the join there, either on demand or periodically.
Robaticus
A: 

What I might suggest is to start by querying the vendor database, ordering the results by some criterion (perhaps a date field, oldest to most recent).

You could do a Skip().Take() to "skim" the results in batches, and then insert each batch into the custom db wherever the ID doesn't already exist. That way you avoid the problem you have now.
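Roughly what I have in mind, as a sketch only: VendorRecord and BatchSync are made-up names, an IQueryable stands in for the vendor table, and a plain ISet<int> stands in for the ID table in the custom db.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of a vendor row, for illustration only.
public class VendorRecord
{
    public int Id { get; set; }
    public DateTime Created { get; set; }
}

public static class BatchSync
{
    // Pages through the vendor records, oldest first, batchSize rows at a time.
    public static IEnumerable<List<VendorRecord>> ReadInBatches(
        IQueryable<VendorRecord> vendorRecords, int batchSize)
    {
        for (int skipped = 0; ; skipped += batchSize)
        {
            List<VendorRecord> batch = vendorRecords
                .OrderBy(r => r.Created)
                .ThenBy(r => r.Id)        // tie-breaker so the paging order is stable
                .Skip(skipped)
                .Take(batchSize)
                .ToList();

            if (batch.Count == 0) yield break;
            yield return batch;
        }
    }

    // Adds only the ids that are not already known; returns how many were added.
    // knownIds stands in for the custom db's table of ids.
    public static int CopyNewIds(
        IEnumerable<List<VendorRecord>> batches, ISet<int> knownIds)
    {
        int added = 0;
        foreach (List<VendorRecord> batch in batches)
            foreach (VendorRecord record in batch)
                if (knownIds.Add(record.Id))   // Add returns false when the id already exists
                    added++;
        return added;
    }
}
```

Each round trip only ever sends two paging parameters, so the parameter limit never comes into play. If the vendor table changes while you are paging, rows can shift between pages, so treat it as a periodic sync rather than an exact snapshot.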

RobS
Yeah, this was something I considered; however, I did not want the clutter or the hassle of syncing the two dbs, since the vendor db is updated on a daily basis. This is a viable solution for some situations, but not mine.
tap
A: 

If you have db-create access to the SQL Server that the vendor's db is running on (or if your custom db is on the same server), you could create a "has been imported" table in a different database on that same server, and then write a stored proc that does a cross-database join of that table against the vendor db, e.g.:

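-- cross-database references need three-part names (database.schema.table); the dbo schema is assumed here
select top 250 v.*
from vendordb.dbo.to_be_imported v
where not exists
  (select 1 from customdb.dbo.has_been_imported c where c.idWasImported = v.idToBeImported)
order by whatever;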

You might even be able to do this in Linq 2 SQL -- I've never tried adding objects from different databases into a single DataContext...

Ben M
This could be the best solution where performance is important (again, in my situation I'm not able to test that). But it is not really what I was asking for, as I was wondering whether there is a way to do it without a cross-db SP.
tap
@tap oh yeah, so you did -- sorry. By the way, which version of SQL Server are you using? 2005 or 2008?
Ben M