Hi,

A customer has a web-based inventory management system. The system is proprietary and complicated: it has around 100 tables in the DB with complex relationships between them, and ~1,500,000 items.

The customer is reorganising some of his processes and now needs to make massive updates and manipulations to the data (only data changes, no structural changes). The online screens do not permit such work, since they were designed at the beginning without this requirement in mind. The database is MS SQL 2005, and the application is an ASP.NET application running on IIS.

One solution is to build new screens for him where he could visualize the data in grids and do the required job on a large number of records. This will permit us to use the already existing functions that deal with single items (we just need to implement a loop). At this moment the customer is aware of two kinds of such massive manipulations he wants to do, but says there will be others. This will require design, coding, and testing every time we have a request.

However, the customer's needs are urgent because of some regulatory requirements, so I am wondering whether it would be more efficient to use some kind of mapping between MS SQL and Excel or Access to expose the needed information, make the changes in Excel or Access, and then save them back to the DB, maybe using SSIS. I am not familiar with SSIS or other technologies that do such things, so I am not able to judge whether the second solution is indeed efficient and better than the first. Of course the second solution will also require some work and testing, but will it be quicker and less expensive?

The other question is: are there any other ways to do this?

Any ideas will be greatly appreciated.

A: 

I doubt Excel will be able to deal with 1.5 million elements/rows (Excel 2003 tops out at 65,536 rows per sheet).

When you say visualise data in grids - how will your customer make changes? Manually, or is there some automation behind it? I would strongly encourage automation (since you know about only 2 types of changes at the moment). Maybe even a simple standalone "converter" application - don't make it part of the main program - it will be too tempting for them in the future to manually edit data straight in the DB tables.

DmitryK
The changes will be on subsets of data, not the whole 1.5M. I thought about manual changes in the grid with a submit function that would call a loop on the server to invoke the already existing functions. Indeed, I am not willing to make this functionality part of the application; that's why I am not very excited about the grid and the online screen.
albert green
A: 

Either way, you are going to need testing.

Say you export 40,000 products to Excel, he re-organizes them, and then you bring them back into staging table(s) and apply the changes to your SQL table(s). Since Excel is basically a freeform system, what happens if he introduces invalid situations? Your update will need to detect that, fail and roll back, or handle it in some specified way.
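
As a rough sketch of that staging-and-validate step in T-SQL (the table and column names here are invented for illustration, not taken from the actual schema):

    -- Hypothetical staging flow: Item is the real table, Staging_Item holds the
    -- rows re-imported from Excel. Names and columns are illustrative only.
    BEGIN TRY
        BEGIN TRANSACTION;

        -- Reject the whole batch if Excel introduced rows that break referential
        -- rules, e.g. a CategoryId that does not exist.
        IF EXISTS (SELECT 1
                   FROM Staging_Item s
                   LEFT JOIN Category c ON c.CategoryId = s.CategoryId
                   WHERE c.CategoryId IS NULL)
            RAISERROR('Staging data contains unknown CategoryId values.', 16, 1);

        UPDATE i
        SET    i.CategoryId = s.CategoryId,
               i.UnitPrice  = s.UnitPrice
        FROM   Item i
        JOIN   Staging_Item s ON s.ItemId = i.ItemId;

        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;
        -- Re-raise so the calling process (SSIS package, console app, ...) sees the failure.
        DECLARE @msg nvarchar(2048);
        SET @msg = ERROR_MESSAGE();
        RAISERROR(@msg, 16, 1);
    END CATCH;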

Anyway, both your suggestions can be made workable.

Personally, for large changes like this, I prefer to have an experienced database developer develop the changes in straight SQL (either hardcoded or table-driven), test it on production data in a test environment (doing a table compare between before and after) and deploy the same script to production. This also allows the use of the existing stored procedures (you are using SPs to enforce a consistent interface to this complex database, right?) so that basic database logic already in place is simply re-used.
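
One way to do that before/after table compare (just one option, assuming the columns are of comparable types) is to snapshot the table into a copy, run the script, and diff the two with EXCEPT, which SQL 2005 supports:

    -- Item and Item_Before are placeholder names for illustration.
    SELECT * INTO Item_Before FROM Item;

    -- ... run the batch update script here ...

    -- Rows that were changed or removed by the script:
    SELECT * FROM Item_Before
    EXCEPT
    SELECT * FROM Item;

    -- Rows that were added or changed by the script:
    SELECT * FROM Item
    EXCEPT
    SELECT * FROM Item_Before;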

Cade Roux
Yes, if we use Excel we will have to implement validation logic, but I heard about SSIS and the possibility to add logic there. What about mapping with Access tables and having basic forms that expose the required data? No, I am not using SPs; all the logic is in controllers running on the server. Even if we involve a database expert, how could we expose things to the customer? He must not have to deal with technical details.
albert green
Do you have any referential integrity in your database?
Cade Roux
Yes, the database contains referential integrity, but we do not have business logic running on the DB server; it is in the application server.
albert green
If your business objects are in assemblies, I would look at automating them from a separate batch console or Windows Forms application. As far as the interface goes, to be honest, I think large batch operations are usually too important to be left to end users in the first place. Often I drive changes like this with database tables. The time you would spend developing a bullet-proof interface could be better spent doing and testing different categories of batch updates. I know this doesn't sound like a typical outside developer relationship.
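To make the "drive changes with database tables" idea concrete, here is a rough, purely illustrative sketch (the table and column names are made up):

    -- A control table describes what should change; one tested statement applies it.
    -- This keeps the "what" (rows supplied per request) separate from the "how".
    CREATE TABLE BatchChange_Category
    (
        ItemId        int NOT NULL PRIMARY KEY,
        NewCategoryId int NOT NULL
    );

    -- Load BatchChange_Category from a spreadsheet, an SSIS package, or plain INSERTs, then:
    UPDATE i
    SET    i.CategoryId = b.NewCategoryId
    FROM   Item i
    JOIN   BatchChange_Category b ON b.ItemId = i.ItemId;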
Cade Roux
Yes, you are right Cade Roux, I will investigate your approach further. I find your answer useful, but I don't have enough reputation to vote; hope someone will do it. Thanks.
albert green
A: 

Here is a strategy that I think will get you from A to B in the shortest amount of time.

One solution is to build new screens for him where he could visualize the data in grids and do the required job on a large number of records.

It's rarely a good idea to build an interface into the main system that you will only use once or twice. It takes extra time and you'll probably spend more time maintaining it than using it.

This will permit us to use the already existing functions that deal with single items (we just need to implement a loop)

Hack together your own crappy little interface in a .NET Application, whose sole purpose is to fulfill this one task. Keep it around in your "stuff I might use later" folder.

Since you're dealing with such massive amounts of data, make sure you're not running your app from a remote location.

Obtain a copy of SQL 2005 and install it on a virtualization layer. Copy your production database over to this virtualized SQL Server. Take a snapshot of your virtualized copy before you begin testing. Write and test your app against this virtualized copy. Roll back to your original snapshot each time you test. Keep changing your code, testing, and rolling back until your app can flawlessly perform the desired changes.
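
Getting the production database onto the virtualized SQL Server is usually just a backup/restore; something along these lines, where the database name, logical file names, and paths are placeholders you would replace with your own:

    -- On the production server:
    BACKUP DATABASE Inventory
    TO DISK = 'D:\Backups\Inventory.bak'
    WITH INIT;

    -- On the virtualized test server, after copying the .bak file over
    -- (logical file names here are assumed; check them with RESTORE FILELISTONLY):
    RESTORE DATABASE Inventory
    FROM DISK = 'D:\Backups\Inventory.bak'
    WITH MOVE 'Inventory'     TO 'D:\Data\Inventory.mdf',
         MOVE 'Inventory_log' TO 'D:\Data\Inventory_log.ldf',
         REPLACE;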

Then, when the time comes for you to change the production database, you can sit back and relax while your app does all of the changes. Since the process will likely take a while, add some logging so you can check the status as it runs.
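
For the logging, one simple option (just a sketch; the table and step names are invented) is a small log table that the batch process writes to after each chunk of work:

    -- Minimal progress log; the batch process inserts a row per chunk it finishes.
    CREATE TABLE BatchRunLog
    (
        LogId       int IDENTITY(1,1) PRIMARY KEY,
        RunAt       datetime NOT NULL DEFAULT GETDATE(),
        Step        varchar(200) NOT NULL,
        RowsChanged int NULL
    );

    -- Example entry written after one chunk of updates:
    INSERT INTO BatchRunLog (Step, RowsChanged)
    VALUES ('Re-categorized items 1-50000', 50000);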

Oh yeah, make sure you have a fresh backup before you run your big update.

James Jones
James Jones, I am not sure I understand you completely with this virtualization layer approach; I am not familiar with it. Could you please provide more details or a reference for me to gather more info?
albert green
Virtualization is a large concept that I can't fit into 600 chars, so I will focus on the area that applies to this situation. Where I work, I use a program called VMware Workstation. With it, I am able to run almost any OS within my currently running OS (which happens to be Windows 7). It also lets me take "snapshots", which are full copies of the entire operating system. I am able to "roll back" to any of these snapshots whenever I like. This makes it extremely useful for testing purposes. I don't know what I would do without it.
James Jones
Oh, I know what virtualization is and I am using it too. Maybe I have not expressed myself correctly, sorry about that. What I did not understand is how virtualization could solve the problem. I understand that it provides a dev/test environment, but with or without it I still have to choose a solution.
albert green