Goal: Using a cron job (or other scheduled event), update a Rails database with a nightly export of data from an existing system.
All data is created/updated/deleted in that existing system. The website does not directly integrate with it, so the Rails app simply needs to reflect the updates that appear in the data export.
I have a `.txt` file of ~5,000 products that looks like this:

```
"1234":"product name":"attr 1":"attr 2":"ABC Manufacturing":"2222"
"A134":"another product":"attr 1":"attr 2":"Foobar World":"2447"
...
```
All values are strings enclosed in double quotes (`"`) and separated by colons (`:`).
Fields are:

- `id`: unique id; alphanumeric
- `name`: product name; any character
- attribute columns: strings; any character (e.g., size, weight, color, dimension)
- `vendor_name`: string; any character
- `vendor_id`: unique vendor id; numeric
Vendor information is not normalized in the current system.
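For reference, this is roughly how I was planning to parse each line. It's only a sketch: splitting on the `":"` between quoted fields assumes that sequence never appears inside a field value.

```ruby
line = '"1234":"product name":"attr 1":"attr 2":"ABC Manufacturing":"2222"'

# Strip the outermost quotes, then split on the quote-colon-quote
# delimiter that separates the quoted fields.
fields = line.strip.sub(/\A"/, "").sub(/"\z/, "").split('":"')
# => ["1234", "product name", "attr 1", "attr 2", "ABC Manufacturing", "2222"]

id, name, attr1, attr2, vendor_name, vendor_id = fields
```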
What are best practices here? Is it okay to delete the products and vendors tables and rewrite with the new data on every cycle? Or is it better to only add new rows and update existing ones?
Notes:
- This data will be used to generate `Orders` that will persist through nightly database imports. `OrderItems` will need to be connected to the product ids specified in the data file, so we can't rely on an auto-incrementing primary key to be the same for each import; the unique alphanumeric id will need to be used to join `products` to `order_items`.
- Ideally, I'd like the importer to normalize the Vendor data.
- I cannot use vanilla SQL statements, so I imagine I'll need to write a `rake` task in order to use `Product.create(...)` and `Vendor.create(...)` style syntax.
- This will be implemented on EngineYard.
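To make the question concrete, here is the kind of rake task I have in mind. It is only a sketch: it assumes hypothetical `legacy_id` columns on `Product` and `Vendor` for the export's stable ids, placeholder attribute names (`attr1`, `attr2`), and a hard-coded file path, and it upserts rather than wiping the tables so that existing `order_items` keep pointing at the same product rows.

```ruby
# lib/tasks/import.rake (sketch; model/column names are assumptions)
namespace :import do
  desc "Nightly product/vendor import from the legacy export"
  task products: :environment do
    File.foreach("tmp/export.txt") do |line|
      # Strip outer quotes and split on the quote-colon-quote delimiter.
      fields = line.strip.sub(/\A"/, "").sub(/"\z/, "").split('":"')
      legacy_id, name, attr1, attr2, vendor_name, vendor_id = fields

      # Normalize vendors: one row per unique legacy vendor id.
      vendor = Vendor.find_or_initialize_by(legacy_id: vendor_id)
      vendor.update!(name: vendor_name)

      # Upsert products keyed on the stable alphanumeric id, so rows
      # referenced by existing OrderItems are updated in place.
      product = Product.find_or_initialize_by(legacy_id: legacy_id)
      product.update!(name: name, attr1: attr1, attr2: attr2, vendor: vendor)
    end
  end
end
```

Is an upsert loop like this (with `find_or_initialize_by` on the stable id) the right shape, or is a wipe-and-reload acceptable given the foreign-key constraint above?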