I'm developing an app that will need to import data for three models from a single CSV file (with one-to-many associations). I have set up a Datafile model & controller to handle upload / parsing of the file. Right now, all the logic for parsing and saving the records is in the controller. This allows me to save to several different models, get the IDs for the saved records, and create associations as necessary while the file is being parsed.

Thinking about the "fat model, skinny controller" principle, I realized I have about 150 lines of code in a controller that is really just processing data. As I started looking at moving this to a model, though, I concluded I would have to process all of this data into arrays (without knowing association IDs), and send it back to the controller to be saved (since the model can't call methods from other models). I'm anticipating having about 1,500 records in the import file. I'm using CakePHP, which has a saveAll() method to save data to several models simultaneously from a single array.

Another option would be to have each of the three models separately parse the file, ignoring any data it doesn't need. This should be possible, as long as I send it to the models in the right order, and give the "belongsTo" models a list of possibly associated records to search.

So -- any advice on these options?

  1. Leave the parsing code in the Datafile controller as it is.
  2. Move all the parsing code to the Datafile model, then pass back a large array to be saved through the Datafile controller.
  3. Send the file separately to each of the three models, along with supplemental lists for determining associations.
A: 

Is this an operation that will only be performed once? If so, write a script to do it. Otherwise....

I'd import all of the data in the primary controller, construct the data array (in the same format as $this->data), then save it with either saveAll or individual save calls, depending on which is more appropriate. So long as it's saved with the correct relations, it doesn't really matter how you do it!

Remember to call $this->myModel->create() before each individual save so that every save inserts a new record; otherwise you can end up with just one row.
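
As a rough illustration of constructing the data array, the sketch below groups flat CSV rows by their parent record. The model names (Customer, Order), the column names, and the function buildSaveData() are all hypothetical. The nested shape is meant to resemble what saveAll takes for one parent with hasMany children, so check it against the manual for your Cake version.

```php
<?php
// Hypothetical layout: Customer hasMany Order, with CSV columns
// customer_name, customer_email, order_number, order_total.
// Simplification: splitting on newlines breaks quoted fields that contain
// line breaks; read the file with fgetcsv() if that matters.

function buildSaveData(string $csv): array
{
    $lines  = preg_split('/\R/', trim($csv));
    $header = str_getcsv(array_shift($lines), ',', '"', '\\');

    $grouped = [];
    foreach ($lines as $line) {
        if ($line === '') {
            continue; // skip blank lines
        }
        // Map column names to values (assumes every row has all columns).
        $row = array_combine($header, str_getcsv($line, ',', '"', '\\'));

        // Use the email as the key that identifies one parent record.
        $key = $row['customer_email'];
        if (!isset($grouped[$key])) {
            $grouped[$key] = [
                'Customer' => [
                    'name'  => $row['customer_name'],
                    'email' => $row['customer_email'],
                ],
                'Order' => [],
            ];
        }
        // Every row contributes one child record to its parent.
        $grouped[$key]['Order'][] = [
            'number' => $row['order_number'],
            'total'  => $row['order_total'],
        ];
    }

    // One element per parent, each with its children nested under 'Order'.
    return array_values($grouped);
}
```

Each element of the returned array could then be saved in a loop, with a create() call before each save.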

Leo
I want the client to be able to do it multiple times if necessary (e.g., if they make corrections to the data file), but it's not something that would be done on a regular basis.
handsofaten
So you don't think there's any particular reason to put all this code in a model as opposed to a controller? I have it working the way I want right now, but I feel like I'm breaking the "skinny controller" principle.
handsofaten
+1  A: 

For import scripts I'd always prefer a shell script. A controller is probably the worst possible place to put complicated import logic, since it's almost impossible to reuse.

If you don't have a problem with running the script from the CLI, use a Shell. You can even invoke it from a web interface, if need be.
Otherwise, put the logic into the model, to be able to call it from both the web and the CLI.
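
A minimal sketch of what "callable from both the web and the CLI" could look like, written in plain PHP rather than against Cake's model API. The class name DatafileImporter and the $saveRow callback are hypothetical stand-ins; in a Cake app the import method would live on the Datafile model and the callback would be the actual save logic.

```php
<?php
// Hypothetical class: the import logic sits in one place, so a controller
// action and a shell task can both call import() with a file path.
class DatafileImporter
{
    /** @var callable receives one associative row at a time */
    private $saveRow;

    public function __construct(callable $saveRow)
    {
        $this->saveRow = $saveRow;
    }

    /** Reads the CSV at $path and returns the number of rows handed to the callback. */
    public function import(string $path): int
    {
        $handle = fopen($path, 'r');
        if ($handle === false) {
            throw new RuntimeException("Cannot open $path");
        }

        $header = fgetcsv($handle, 0, ',', '"', '\\');
        if ($header === false) {
            fclose($handle);
            return 0; // empty file
        }

        $count = 0;
        while (($fields = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
            if ($fields === [null]) {
                continue; // fgetcsv() returns [null] for a blank line
            }
            ($this->saveRow)(array_combine($header, $fields));
            $count++;
        }

        fclose($handle);
        return $count;
    }
}
```

The controller would pass the uploaded file's temporary path, and a shell would pass a path taken from its command-line arguments; neither needs to know how rows are parsed.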

deceze
Thanks deceze... I do want to use the web interface (or rather I want the client to be able to use it). It hadn't occurred to me to create a shell, which I haven't done in Cake before. I'm not seeing how you call a shell from a controller in the manual... how would you do this?
handsofaten
@hands You can just launch it like you would on the CLI using one of the [`system`](http://php.net/system) commands. But if you're planning to do that often, you should structure the code so you can simply call a function in PHP, i.e. put the logic into the model.
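
For illustration, launching a console command from PHP might look roughly like this. The console path and shell name are assumptions (they differ between Cake versions and installations), and both function names are hypothetical.

```php
<?php
// Build the command line, escaping every argument so that paths with
// spaces or shell metacharacters are passed through safely.
function buildImportCommand(string $consolePath, string $shellName, string $csvPath): string
{
    return escapeshellarg($consolePath)
        . ' ' . escapeshellarg($shellName)
        . ' ' . escapeshellarg($csvPath);
}

// Run the command and return its exit code and output lines.
// Redirecting stderr to stdout captures error messages as well.
function runImportShell(string $command): array
{
    $output   = [];
    $exitCode = 0;
    exec($command . ' 2>&1', $output, $exitCode);
    return [$exitCode, $output];
}

// Example (the path and shell name are placeholders, not real defaults):
// $command = buildImportCommand('/path/to/cake', 'import', '/tmp/upload.csv');
// [$exitCode, $lines] = runImportShell($command);
```
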
deceze
+1  A: 

I built something similar to this, although it was not model-specific. It allows you to upload and download CSV sheets that use multiple models in the same sheet. I used a series of components that called the models I wanted to upload as CSV sheets.

I have a rawdata component that processes files line by line. On each line it does a callback to the controller for before_process_row and after_process_row, etc. The controller callback then calls another component to do the splicing up of the data into the different models based on rules.

The only information I keep in the models is the rules for which fields appear in the CSV file, plus the file header information and other search criteria.
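
To give a rough idea of the row callbacks and the per-model rules, here is a simplified sketch in plain PHP. It is not the actual component code: the $rules array, the function names, and the column names are all made up for illustration.

```php
<?php
// Hypothetical rules: which CSV columns belong to which model.
$rules = [
    'Customer' => ['customer_name', 'customer_email'],
    'Order'    => ['order_number', 'order_total'],
];

// Split one flat row into per-model arrays according to the rules.
function splitRowByModel(array $row, array $rules): array
{
    $split = [];
    foreach ($rules as $model => $columns) {
        // array_flip() turns the column list into keys, so
        // array_intersect_key() keeps only that model's columns.
        $split[$model] = array_intersect_key($row, array_flip($columns));
    }
    return $split;
}

// Process rows one at a time, calling back before and after each row.
// $beforeRow may clean the row up, or return null to skip it;
// $afterRow receives the row already split by model.
function processRows(iterable $rows, array $rules, callable $beforeRow, callable $afterRow): void
{
    foreach ($rows as $row) {
        $row = $beforeRow($row);
        if ($row === null) {
            continue;
        }
        $afterRow(splitRowByModel($row, $rules));
    }
}
```

Keeping the rules as data means supporting another model is mostly a matter of adding an entry to the rules, not writing new parsing code.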

It may be a little more complex than you're looking for, but I went with components for the data processing part.

Dooltaz
Thanks, this is a good idea. It does sound more complex than what I'm looking at, but worth keeping in mind.
handsofaten