I am currently using symfony 1.4 and would like to allow users to upload Microsoft Word docx files. Using the sfWidgetFormInputFile widget and the sfValidatorFile validator below, users can select and successfully upload their docx files through a simple web form.

$this->widgetSchema['file_name'] = new sfWidgetFormInputFile(array('label' => 'File'));

$this->validatorSchema['file_name'] = new sfValidatorFile(array(
  'required'   => true,
  'path'       => sfConfig::get('sf_upload_dir').DIRECTORY_SEPARATOR.sfConfig::get('app_dir_file_sharing').DIRECTORY_SEPARATOR,
  'mime_types' => array('application/msword',
                        'application/vnd.ms-word',
                        'application/msword',
                        'application/msword; charset=binary')
), array(
    'invalid'    => 'Invalid file.',
    'required'   => 'Select a file to upload.',
    'mime_types' => 'The file must be a supported type.'
));

The problem is that after the file is uploaded, the extension is changed to .zip and the file contains a tree of XML files. My understanding is that this happens because Office 2007 now uses the Office Open XML file formats. Is there any way to prevent this using symfony or PHP?

+4  A: 

It seems to be a bug in Symfony's file type detection. A workaround is described.

littlegreen
+4  A: 

The problem is content sniffing. The new Office formats ARE .zip files, so if the content is sniffed on upload, the browser will identify the file as a ZIP and set the Content-Type header accordingly. Similarly, on download, unless your server sets the proper Content-Type HTTP response header, the browser will assume that this is a ZIP file.
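To see why sniffing lands on ZIP, here is a minimal sketch that checks whether a file begins with the ZIP local-file-header signature. The function name looksLikeZip is hypothetical and not part of symfony or PHP; it only illustrates that a docx is indistinguishable from a ZIP archive by its first bytes.

```php
<?php
// Hypothetical helper (not part of symfony): reports whether a file starts
// with the ZIP local-file-header signature "PK\x03\x04".
function looksLikeZip($path)
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        return false;
    }

    // Read only the first four bytes; that is all the signature check needs.
    $signature = fread($handle, 4);
    fclose($handle);

    return $signature === "PK\x03\x04";
}
```

Calling looksLikeZip() on a valid docx should return true, which is why a detector that looks only at the leading bytes reports application/zip.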

EricLaw -MSFT-
Yes, the new Office 2007 formats are XML files that have been zipped.
TravisO
+2  A: 

Symfony 1.3+ has a mime_type_guessers option for sfValidatorFile that lets you define your own MIME type guesser as a PHP callable or use a built-in guesser. Calling any of the three built-in MIME type guessers finds the correct file type for docx and keeps the docx file extension.

Here is the updated code using guessFromFileinfo:

$this->validatorSchema['file_name'] = new sfValidatorFile(array(
  'required'           => true,
  'path'               => sfConfig::get('sf_upload_dir').DIRECTORY_SEPARATOR.sfConfig::get('app_dir_file_sharing').DIRECTORY_SEPARATOR,
  'mime_type_guessers' => array('guessFromFileinfo'),
  'mime_types'         => array('application/msword',
                                'application/vnd.ms-word',
                                'application/msword',
                                'application/msword; charset=binary')
), array(
    'invalid'    => 'Invalid file.',
    'required'   => 'Select a file to upload.',
    'mime_types' => 'The file must be a supported type.'
));
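
For the "define your own guesser" route mentioned above, a rough sketch of a custom callable might look like the following. The function name guessDocxMimeType is hypothetical, and the sketch assumes the PHP zip extension (ZipArchive) is installed and that the validator calls each guesser with the path of the uploaded temporary file. It treats a ZIP archive containing word/document.xml as a docx.

```php
<?php
// Hypothetical guesser (not part of symfony): returns the docx MIME type when
// the file is a ZIP archive containing word/document.xml, otherwise null.
function guessDocxMimeType($file)
{
    // Assumption: the zip extension is available; without it we cannot guess.
    if (!class_exists('ZipArchive')) {
        return null;
    }

    $zip = new ZipArchive();
    if ($zip->open($file) !== true) {
        // Not a readable ZIP archive, so not a docx.
        return null;
    }

    // A Word document stores its main body under word/document.xml.
    $isDocx = $zip->locateName('word/document.xml') !== false;
    $zip->close();

    return $isDocx
        ? 'application/vnd.openxmlformats-officedocument.wordprocessingml.document'
        : null;
}
```

It could presumably be registered with 'mime_type_guessers' => array('guessDocxMimeType'), in which case the docx MIME type returned here would also need to be listed in mime_types for the upload to pass validation.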
markb