views:

200

answers:

4

I have been tasked with developing a new retail e-commerce storefront for my current job, and I am considering tackling it with RoR to A) Build a "real" project with my limited Rails knowledge, and B) Give management quick turnaround and feedback (they are wanting to get this done ASAP and their deadlines are rather unrealistic - I'm talking a couple of weeks to go from nothing to working model so they can start to market it with SEO/SEM and, I kid you not, "video blogging" because my boss heard that's the future).

We do have a database structure in place but it's absolutely terrible and was thrown together without rhyme nor reason, so I'm going to largely ignore it and create a new database from scratch; however, I have existing data that I need to load into the application (like I said, it's an e-commerce app and we have the product data). I need to massage this data into a usable format because our supplier provides it to us with cryptic, abbreviated column names and it's highly denormalized, especially in the categories (I've posted a question regarding it before - basically the categories table has six fields, one for each category/subcategory, with some of them being blank if that category doesn't apply).

There are two main issues that are giving me second thoughts:

1) As I said the data needs to be put into a "proper" database schema; I can't just load it as-is. I have some thoughts as to a good data model for it, but my analysis is not completed yet. There would end up being a large amount of joins tables to link various things together (e.g. products_categories, products_attributes, products_prices) etc and these tables would link products not via an ID but by their SKU (see below).

2) Everything already has an ID that's generated for it, but anything new I add needs to have one autogenerated; I doubt this will be a problem with any mature RDBMS, but I know Rails likes to generate IDs itself. Also, almost all of the product-related tables are linked by SKUs (and in the data provided by the supplier are actually a composite key consisting of the prefix and stock number, which combined make up the full SKU), not by IDs and I'm not sure if this will be a performance issue (of course, I could always manually create indexes on these columns to speed it up). It does mean that I'll need to break away from the Rails conventions, however.

In short, I think that Rails might be a good choice as far as time-to-market and ease-of-development, but having to work with the existing data content might turn into a pain because the application will need to be developed around that, instead of the "traditional" Rails app, and that factor is giving me major doubts about using Rails. There are also some other issues (having to set up a Linux server, and the fact that the area I live in has very few Rails developers so if I left the company I'd basically be holding them hostage as far as updates/modifications). I'm really unsure as to the best path to proceed.

+6  A: 

I would develop the app as if you didn't have the data. Use the ORM and make your database the best it can be, but of course keep in mind what data you have to populate it with (eg: don't make crazy new constraints for things that will leave you going through old data record by record).

When you're done and tested, write an import script that pulls your real data onto your new database.

It's not that different from the conventional design/development model... Apart from you can do your data-input in a semi-automated fashion.

Oli
+1  A: 

I was in the same situation not too long ago — a crappy PHP app that held ten years worth of all company data.

What I did was simply create a Migration model and added methods to import each resource.

class Migration
  def migration_all
    self.jobs
  end

  def self.jobs
    ...
  end
end

The cool thing about this is that you can arrange which order resources are imported as one will likely reference another. I also added methods that directly modified the db schema. One nice trick if you have to keep an existing primary key is to create a field named 'legacy_id', copy over your existing primary key, and when you're done, simply remove the 'id' field, rename the 'legacy_id' field to 'id', then add the primary_key constraint on the new 'id' field.

Matt Darby
+1  A: 

Don't use the SKU as the unique key for each product - use the standard Rails incremented id.

SKU could change as it may be misentered, etc and that would make it a nightmare to change all of the references from other tables. Put your current id in a sku column, index it and update the references in your other tables to the Rails ids.

You'd be able to do Product.find_by_sku(params[:sku]) in your controllers, set up a /products/:sku route, etc. I don't see what you'd gain (other than a headache) by using your non generated ids as the database primary keys.

RichH
Well, the data does have a unique "Product ID" as well as the SKU, but all of the additional info tables do the composite key based off of the SKU, so it would take a bit of work (maybe?) to retrofit them to use the regular ID column.
Wayne M
+1  A: 

I'd also suggest running your old data through your app's validations to make sure you are not loading up a bunch of inconsistencies and erros. It will help your app run smoothly and highlight existing data errors at a point where you can fix them.

Don't assume the existing data is valid just because it is already there.

srboisvert