Hello.

I have a staging database that stores geolocation data in the following structure:
Countries
Regions
Cities
Postal Code
Longitude
Latitude

I get the data from a vendor (the name is not relevant). It comes as a CSV file with these columns:
Start IP
End IP
Country
Region
City
Postal Code
Longitude
Latitude

The data in the CSV file tends to change: the postal code, city name, region name, or IP range can all differ from one file to the next.

The part of the application that handles the data import currently works as follows: delete the Countries, Regions, Cities, Postal Codes, etc., then repopulate the database from the file.

I need a better way to do this. Once the application is live, deleting will lose the keys of the entries that are already in the database. The import also takes about 2 minutes, during which the application won't be able to use the GEO location DB. So I can't really use delete and insert.

I want to implement it so that I load all the data into memory in the same structure from both sources, i.e. a Country dictionary, a Region dictionary, and so on, built from the DB and from the CSV file, then detect the changes and update the database in one transaction.

The problem is how to map them so that I can detect changes. For example, if a country name changes :), I need to update it to whatever the name changed to in the CSV file. OK, but what if more than one country name changes? The same goes for regions, cities, and postal codes.

Yes, I store this as a tree structure. Country is the root node, Regions are the first-level children, Cities are the second level, and Postal Codes are the cherries.
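To make the tree comparison concrete, here is a minimal Python sketch, assuming each node is identified by its full path of names (country, region, city, postal code). The function names and the sample rows are made up for illustration, and the column headers follow the vendor layout listed above. Note the limitation: a rename shows up as one removed path plus one added path, so pairing the two still needs some other signal.

```python
import csv
import io

# Hierarchy columns, from root to leaf (names taken from the CSV layout above).
LEVELS = ("Country", "Region", "City", "Postal Code")


def paths_by_level(csv_text):
    """Collect the set of tree paths seen at each level of the hierarchy."""
    levels = [set() for _ in LEVELS]
    for row in csv.DictReader(io.StringIO(csv_text)):
        path = ()
        for depth, column in enumerate(LEVELS):
            path += (row[column],)
            levels[depth].add(path)
    return levels


def level_changes(old_levels, new_levels):
    """For each level, report paths that appeared and paths that vanished."""
    return [
        {"added": new - old, "removed": old - new}
        for old, new in zip(old_levels, new_levels)
    ]


# Made-up sample data: the city "Alpha" becomes "Beta" in the new file.
HEADER = "Start IP,End IP,Country,Region,City,Postal Code,Longitude,Latitude\n"
OLD_CSV = HEADER + "1.0.0.0,1.0.0.255,Freedonia,North,Alpha,1000,10.0,20.0\n"
NEW_CSV = HEADER + "1.0.0.0,1.0.0.255,Freedonia,North,Beta,1000,10.0,20.0\n"

changes = level_changes(paths_by_level(OLD_CSV), paths_by_level(NEW_CSV))
for column, change in zip(LEVELS, changes):
    print(column, change)
```

Because a path includes every ancestor name, a renamed city also changes the paths of the postal codes under it, which is why the country and region levels report no changes here while the city and postal code levels both do.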

Any ideas?

I'm sorry, this was kind of long to explain. I appreciate the time you took to read through it.

+1  A: 

Perform a diff on CSV files and use that to craft SQL that will update the database.
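As a rough illustration of the diff step only (not of generating SQL), the sketch below compares two CSV snapshots row by row in Python. `read_rows` and `diff_rows` are hypothetical helper names, and the two-column sample is invented to keep it short; it assumes both files fit in memory and that row order does not matter.

```python
import csv
import io


def read_rows(csv_text):
    """Return the data rows of a CSV snapshot as a set of tuples."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header line
    return {tuple(row) for row in reader}


def diff_rows(old_text, new_text):
    """Return (added, removed) rows between two snapshots, sorted for stability."""
    old, new = read_rows(old_text), read_rows(new_text)
    return sorted(new - old), sorted(old - new)


# Invented sample snapshots.
OLD = "Country,City\nFreedonia,Alpha\nFreedonia,Beta\n"
NEW = "Country,City\nFreedonia,Alpha\nFreedonia,Gamma\n"

added, removed = diff_rows(OLD, NEW)
print("added:", added)
print("removed:", removed)
```

A changed row appears once in each list, so deciding whether it is an update or a delete plus an insert still requires a key that stays stable between files.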

Ben S
Hmm, I'll think about that. Thank you.
It's far from a simple solution, but it will yield the best results. If you can find a way to automate detecting which columns have been updated and which rows are new or removed, generating the SQL would be fairly simple.
Ben S
I can detect it from the Longitude and Latitude, provided they never change. And we use neither SQL nor an ORM; it's something else :)
A: 

If you want to do an update, you can use Redgate SQL Compare (MS SQL). It is very good and will give you the scripts. There are also other tools that can do this sort of comparison. The steps would be:

  1. Set your database to single-user mode.
  2. Run the updates.
  3. Release it from single-user mode.

Alternative: load the data into a table with a new name, drop the old table, and rename the new table to the original name. Of course, you would also have to handle any primary key / foreign key relationships.
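The same build-aside-then-swap idea can be done inside an application without SQL. The Python sketch below is only an illustration under assumptions: `GeoStore` is a hypothetical class, the snapshot is a plain dictionary keyed by postal code, and the sample values are invented. It addresses the downtime during import, not the problem of preserving keys that other objects refer to.

```python
import threading


class GeoStore:
    """Holds the live lookup table; readers always see a complete snapshot."""

    def __init__(self, snapshot):
        self._snapshot = snapshot
        self._lock = threading.Lock()

    def lookup(self, postal_code):
        # Read the current reference once so a concurrent swap cannot
        # change the table in the middle of this lookup.
        snapshot = self._snapshot
        return snapshot.get(postal_code)

    def swap(self, new_snapshot):
        # new_snapshot must be fully built before this call; the swap
        # itself is just a reference assignment guarded against other writers.
        with self._lock:
            old, self._snapshot = self._snapshot, new_snapshot
        return old


# Invented sample data.
store = GeoStore({"1000": "Alpha"})
previous = store.swap({"1000": "Beta"})
print(store.lookup("1000"), previous)
```

The two-minute import then happens while building the new dictionary, and readers keep using the old one until `swap` is called.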

ram
I'm sorry, but I'm not asking about MS SQL or any other database.
So what is your DB?
ram
Oh, I see, you want to do it from your application?
ram
I use an object-oriented database, so there is no SQL syntax, only objects. And yes, I want to do it from the application.
+1  A: 

We have many years of experience using geolocation databases. We use IP2Location from http://www.ip2location.com.

It is unlikely that you can update the database in place by using the Start IP and End IP as the reference index.

For example, one IP address range could be split into multiple ranges in a later release.

Therefore, we import each monthly update into a separate table, which lets us compare lookups across the two tables.
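A small Python sketch of why range splitting breaks range-based keys, and of comparing lookups across two releases. This is not IP2Location's API; `build_table` and `lookup` are hypothetical helpers, and the addresses and city names are invented sample data.

```python
import bisect
import ipaddress


def build_table(rows):
    """Sort (start_ip, end_ip, city) rows and keep the starts for bisect."""
    parsed = sorted(
        (int(ipaddress.ip_address(start)), int(ipaddress.ip_address(end)), city)
        for start, end, city in rows
    )
    starts = [entry[0] for entry in parsed]
    return starts, parsed


def lookup(table, ip):
    """Return the city whose range contains ip, or None if no range matches."""
    starts, parsed = table
    value = int(ipaddress.ip_address(ip))
    index = bisect.bisect_right(starts, value) - 1
    if index >= 0 and parsed[index][0] <= value <= parsed[index][1]:
        return parsed[index][2]
    return None


# One range in the older release is split into two in the newer one.
january = build_table([("10.0.0.0", "10.0.0.255", "Springfield")])
february = build_table([
    ("10.0.0.0", "10.0.0.127", "Springfield"),
    ("10.0.0.128", "10.0.0.255", "Shelbyville"),
])

for ip in ("10.0.0.5", "10.0.0.200"):
    print(ip, lookup(january, ip), lookup(february, ip))
```

The range 10.0.0.0 to 10.0.0.255 no longer exists as a row in the second table, so a key built from Start IP and End IP has nothing to match, while a per-address lookup against both tables still shows what changed.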

Tim
Thanks for the response. We use another company, and they provided a file format that lets us update the DB easily.