We have a bit of a messy database situation.

Our main back-office system is written in Visual FoxPro with local data (yes, I know!)

In order to work effectively with the data on our websites, we have chosen to regularly export it to a SQL database. However, the process that does this basically clears out the tables each time and does a full re-insert.

This means we have two SQL databases - one that our FoxPro export process writes to, and another that our websites read from.

This question is concerned with the transform from one SQL database to the other (SqlFoxProData -> SqlWebData).

For a particular table (one of our main application tables), because various data transformations take place during this process, it's not a matter of straightforward UPDATE, INSERT and DELETE statements using self-joins; we're having to use cursors instead (I know!)

This has been working fine for many months, but now we're starting to hit performance problems while an update is taking place (which can happen regularly during the day).

Basically when we are updating SqlWebData.ImportantTable from SqlFoxProData.ImportantTable, it's causing occasional connection timeouts/deadlocks/other problems on the live websites.

I've worked hard at optimising queries, caching, etc., but it's come to the point where I'm looking for another strategy to update the data.

One idea that has come to mind is to have two copies of ImportantTable (A and B), with some concept of which table is currently 'active': we update the non-active table, then switch which table is active.

i.e. websites read from ImportantTableA whilst we're updating ImportantTableB, then we switch websites to read from ImportantTableB.
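
For concreteness, one way that switch could work is a synonym that the websites query, repointed after each load. This is only a minimal sketch under that assumption, and all object names here are hypothetical:

    -- Websites query dbo.ImportantTable, which is a synonym pointing
    -- at whichever copy is currently 'active' (names hypothetical).
    BEGIN TRAN;
        DROP SYNONYM dbo.ImportantTable;
        CREATE SYNONYM dbo.ImportantTable FOR dbo.ImportantTableB;
    COMMIT;

The next run would reload ImportantTableA and repoint the synonym back at it.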

Question is: is this feasible, and is it a good idea? I have done something like it before, but I'm not convinced it's necessarily good for optimisation/indexing etc.

Any suggestions welcome, I know this is a messy situation... and the long term goal would be to get our FoxPro application pointing to SQL.

(We're using SQL 2005 if it helps)

I should add that data consistency isn't particularly important in this instance, seeing as the data is always slightly out of date.

+1  A: 

You should be able to maintain one DB for the website and just replicate to that table from the table in the other SQL DB.

This is assuming that you do not update any data from the website itself.

cfEngineers
We do update the data from the website, but those updates don't need to be maintained once the new data is transferred... it's complicated!
Hainesy
How do you replicate tables?
Hainesy
A: 

Do you have the option of making the updates more atomic, rather than the stated 'clear out and re-insert'? I think Visual FoxPro supports triggers, right? For your key tables, can you add a trigger to the update/insert/delete to capture the IDs of records that change, then move (or delete) just those records?
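
A rough T-SQL illustration of that capture-the-changed-IDs idea (the equivalent FoxPro trigger syntax would differ, and all names here are made up):

    -- Hypothetical delta table plus trigger that logs the ID of every
    -- row touched by an insert, update or delete.
    CREATE TABLE dbo.ImportantTable_Changes
    (
        Id        INT      NOT NULL,
        ChangedAt DATETIME NOT NULL DEFAULT GETDATE()
    );
    GO

    CREATE TRIGGER trg_ImportantTable_Delta
    ON dbo.ImportantTable
    AFTER INSERT, UPDATE, DELETE
    AS
    BEGIN
        SET NOCOUNT ON;
        -- inserted covers INSERT/UPDATE rows, deleted covers UPDATE/DELETE rows
        INSERT INTO dbo.ImportantTable_Changes (Id)
        SELECT Id FROM inserted
        UNION
        SELECT Id FROM deleted;
    END;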

Or how about writing all changes to an offline database, and letting SQL Server replication take care of the sync?

[sorry, this would have been a comment, if I had enough reputation!]

Geoff
What's the score with replication? Will SQL handle this in terms of performance, without causing problems for the live DB?
Hainesy
I don't have enough heavily-loaded experience with it, so you'd need to experiment in your scenario (but I think this is definitely worth doing). Since you're using SQL Server, you also have other options (load balancing between two servers for example).
Geoff
+2  A: 

There are a lot of ways to skin this cat.

I would attack the locking issues first. It is extremely rare that I would use CURSORS, and I think improving the performance and locking behavior there might resolve a lot of your issues.

I expect that I would solve it by using two separate staging tables: one for the FoxPro export into SQL, and alongside it one holding the data transformed into the final format. Then either swap the final table into production using sp_rename, or simply use three INSERT/UPDATE/DELETE statements in a transaction to apply all changes from the final table to production. Either way, there is going to be some locking there, but how big are we talking about?
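
A minimal sketch of that sp_rename swap (table names hypothetical; indexes and constraints travel with the renamed table, and the swap itself is very quick):

    BEGIN TRAN;
        -- move production out of the way, then promote the staging copy
        EXEC sp_rename 'dbo.ImportantTable', 'ImportantTable_Old';
        EXEC sp_rename 'dbo.ImportantTable_Staging', 'ImportantTable';
    COMMIT;
    -- recycle the old copy as the next run's staging table
    EXEC sp_rename 'dbo.ImportantTable_Old', 'ImportantTable_Staging';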

Cade Roux
Thanks, you seem to be suggesting along the lines of what I was thinking. The only question in my mind is: if this is an hourly job, are we going to cause problems for our websites when doing the third stage? I think I will look at the INSERT/UPDATE/DELETE approach...
Hainesy
You shouldn't be talking about much change between the final and production tables, so I would probably wrap it in one big transaction: UPDATE where different, INSERT where missing, DELETE where missing (the other way). Your big task is the transform, and that's not interfering with anyone.
Cade Roux
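
A sketch of that one-big-transaction apply; the Id key and Name column are purely illustrative, and since SQL 2005 has no MERGE it takes three statements:

    BEGIN TRAN;

    -- UPDATE where different (NULL comparisons omitted for brevity)
    UPDATE p
    SET    p.Name = f.Name
    FROM   dbo.Production AS p
    JOIN   dbo.Final      AS f ON f.Id = p.Id
    WHERE  p.Name <> f.Name;

    -- INSERT where missing
    INSERT INTO dbo.Production (Id, Name)
    SELECT f.Id, f.Name
    FROM   dbo.Final AS f
    WHERE  NOT EXISTS (SELECT * FROM dbo.Production AS p WHERE p.Id = f.Id);

    -- DELETE where missing the other way
    DELETE p
    FROM   dbo.Production AS p
    WHERE  NOT EXISTS (SELECT * FROM dbo.Final AS f WHERE f.Id = p.Id);

    COMMIT;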
A: 

Based on your response to Ernie above, you asked how to replicate tables. Here is Microsoft's how-to about replication in SQL 2005.

However, if you're asking about replication and how to do it, that indicates to me that you are a little light on SQL Server experience. That being said, it's fairly easy to muck things up, and while I'm all for learning by experience, if this is mission-critical data you might be better off hiring a DBA, or at the very least testing the #$@#$% out of this before you actually implement it.

GregD
I'm no DBA, that's for sure. To be fair, how many developers have experience of SQL Server replication? It doesn't strike me as a trivial or common operation... certainly not for the kind of thing I'm looking for.
Hainesy
I've done systems that included replication, but always with a bit of DBA(-ish) backup for any sticky technical issues (mainly when replication jobs go down/fail etc.).
kpollock
+1  A: 

"For a particular table (one of our main application tables), because various data transformations take places during this process, it's not a straightforward UPDATE, INSERT and DELETE statements using self-joins, but we're having to use cursors instead (I know!)"

I cannot think of a case where I would ever need to perform an insert, update or delete using a cursor. If you can write the SELECT for the cursor, you can convert it into an INSERT, UPDATE or DELETE. You can join to other tables in these statements and use the CASE statement for conditional processing. Taking the time to do this in a set-based fashion may solve your problem.
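
To illustrate, a cursor loop doing per-row conditional work can often collapse into a single set-based UPDATE with CASE. The columns here are invented, and this assumes both databases live on the same server:

    UPDATE w
    SET    w.Price  = CASE WHEN f.Discontinued = 1 THEN 0
                           ELSE f.Price END,
           w.Status = CASE WHEN f.StockQty > 0 THEN 'InStock'
                           ELSE 'BackOrder' END
    FROM   SqlWebData.dbo.ImportantTable    AS w
    JOIN   SqlFoxProData.dbo.ImportantTable AS f ON f.Id = w.Id;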

One thing you may consider if you have lots of data to move: we occasionally create a view onto the data we want, and then have two tables, one active and one that the data will be loaded into. When the data is finished loading, as part of your process you run a simple command to switch the view to the table you just finished loading. That way the users are only down for a couple of seconds at most, and you won't create locking issues where they are trying to access data as you are loading.

You might also look at using SSIS to move the data.

HLGEM
Thanks. I recognise cursors are an avoid-at-all-costs strategy, but we're having to do a lot of extra work on each update/insert, which makes them the quickest and possibly only viable solution. Unfortunately, users being down for even a second isn't really an option.
Hainesy
Good idea on using a view to switch which table is being looked at, though... is it necessary to drop the view first, or can it just be updated?
Hainesy
Look at the ALTER VIEW statement in BOL.
HLGEM
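
For reference, ALTER VIEW (which the comment above points at) redefines the view in place, so no DROP is needed and permissions on the view are preserved. A sketch with illustrative names:

    ALTER VIEW dbo.vwImportantTable
    AS
    SELECT Id, Name              -- column list illustrative
    FROM   dbo.ImportantTableB;  -- repointed from ImportantTableA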