All the PHP files in my workspace are encoded in Unicode (UTF-8, no BOM). I often duplicate an existing source file to use as a base for a new script. Invariably (with Path Finder or the original Finder), OS X will convert the encoding of the duplicate file to Western (Mac OS Roman).

Is there any way to make OS X behave and not convert the text encoding when duplicating a text file? Or make it use a specific text encoding (other than Western!) by default for all files with .php extension?

+2  A: 

I highly doubt that the "conversion" you're seeing reflects an actual difference between the two copies of the file. A plain copy, whether from the Finder, Path Finder, or the command line, never changes the bytes in the file. The encoding of a text file isn't part of its content, so it's much more likely that the tool you use to view or edit the files (Xcode, TextWrangler, or whatever other editor) is interpreting the same bytes differently.
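
One way to check this for yourself is to copy a file and compare the two copies byte for byte. The following is only a minimal Python sketch: the file names `original.php` and `copy.php` are made up for illustration, and `shutil.copyfile` stands in for whatever tool you actually use to duplicate files.

```python
import filecmp
import shutil
import tempfile
from pathlib import Path


def copy_is_byte_identical(source: Path, duplicate: Path) -> bool:
    """Copy source to duplicate and report whether the bytes match exactly."""
    shutil.copyfile(source, duplicate)
    # shallow=False forces a real content comparison, not just a stat() check.
    return filecmp.cmp(source, duplicate, shallow=False)


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical sample file: UTF-8, no BOM, with one non-ASCII character.
    original = Path(tmp) / "original.php"
    original.write_bytes("<?php echo 'café';\n".encode("utf-8"))

    identical = copy_is_byte_identical(original, Path(tmp) / "copy.php")
    print(identical)  # True
```

If the comparison reports identical bytes for a file duplicated in the Finder as well, the difference you see has to come from how the editor reads the file.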

UTF-8 and Mac OS Roman encode every character up to 0x7f (plain ASCII) identically, so a valid UTF-8 file with no BOM and no characters above 0x7f is byte-for-byte also a valid Mac OS Roman file, and an editor can reasonably label it either way. In any case, I would look for the solution in whatever program is showing you that encoding, not in the file copying procedure of OS X itself.
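
The overlap between the two encodings can be illustrated with a short Python sketch. The helper name `decodes_identically` and the sample strings are invented for this example; `mac_roman` is Python's codec name for Mac OS Roman.

```python
def decodes_identically(data: bytes) -> bool:
    """True when UTF-8 and Mac OS Roman decoding yield the same text."""
    try:
        return data.decode("utf-8") == data.decode("mac_roman")
    except UnicodeDecodeError:
        return False  # the bytes are not valid UTF-8 at all


ascii_only = b"<?php echo 'hello';\n"               # every byte is <= 0x7f
with_accent = "<?php echo 'café';\n".encode("utf-8")  # 'é' becomes 0xC3 0xA9

print(decodes_identically(ascii_only))   # True
print(decodes_identically(with_accent))  # False

# The same bytes read as Mac OS Roman show the familiar mojibake.
print(with_accent.decode("mac_roman"))   # <?php echo 'caf√©';
```

For the ASCII-only file there is nothing in the bytes that lets an editor tell the two encodings apart, so it has to fall back on a default.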

Brian Webster
Thanks for clearing that up! Does Linux behave the same way with text file encoding information (not associating it with the content)?
Gilles
By the way, it was BBEdit. After digging in, I saw that it switches to Roman whenever it "can't guess" the actual encoding of the file.
Gilles
AFAIK, no operating system (Linux included) takes responsibility for determining the encoding of a text file; that is left to individual applications.
Brian Webster
A: 

Are you using the Finder to copy your files, or some other tool (an IDE)?

It's possible your original file has some extended Finder metadata in its Resource Fork that's being lost in the copying, if you don't use Finder to copy it.

-W

Wil Shipley
+1  A: 

Please note that there is indeed an extended attribute in OS X that stores the file's encoding for text files produced through -[NSString writeToFile:...] or similar facilities, which may not be getting copied along with the file.
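
As a rough sketch of how to inspect that attribute from Python: the code below assumes the attribute is named `com.apple.TextEncoding` and that its value looks like `utf-8;134217984` (an encoding name, a semicolon, then a numeric encoding ID). Both details are assumptions not stated above, so verify them on your own system. `read_text_encoding` is a hypothetical helper that shells out to the macOS `xattr` command.

```python
import subprocess
from typing import Optional, Tuple

# Assumed attribute name; check with `xattr -l yourfile` on your system.
ATTR_NAME = "com.apple.TextEncoding"


def parse_text_encoding(value: str) -> Tuple[str, Optional[int]]:
    """Split an assumed 'name;number' value into (name, number or None)."""
    name, _, number = value.strip().partition(";")
    number = number.strip()
    return name, int(number) if number.isdigit() else None


def read_text_encoding(path: str) -> Optional[Tuple[str, Optional[int]]]:
    """Return the parsed attribute for path, or None if it cannot be read."""
    try:
        result = subprocess.run(
            ["xattr", "-p", ATTR_NAME, path],
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:
        return None  # no `xattr` command available (e.g. not on macOS)
    if result.returncode != 0:
        return None  # attribute not set on this file
    return parse_text_encoding(result.stdout)


print(parse_text_encoding("utf-8;134217984"))  # ('utf-8', 134217984)
```

Calling `read_text_encoding` on both the original and the duplicate would show whether the attribute survived the copy.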

millenomi