I edited several ISO-8859-15 encoded PHP source files with NetBeans 6.7.1, but it converted them to UTF-8 (without asking me!), and I lost several German characters in the process...

I'm looking for a tool to find all the utf8 encoded files inside a directory (It's hard for me to tell which file has been broken).

I'd also need a tool to convert them back to ISO-8859-15.

I'm trying to fix the whole thing with gedit, which recognizes and respects the charset of each file, but it won't let me save UTF-8 files as ISO-8859-15, because it says there are characters that can't be converted...
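One plausible explanation for gedit's complaint, sketched in Python: if the ISO-8859-15 bytes were decoded as UTF-8 with invalid sequences replaced (an assumption about what the IDE did, not something confirmed here), each accented character becomes U+FFFD, the Unicode replacement character, which has no ISO-8859-15 equivalent. The sample word is made up for illustration.

```python
# Bytes as they might have been on disk: "Größe" in ISO-8859-15.
original = "Gr\u00f6\u00dfe".encode("iso-8859-15")

# Assumed failure mode: the bytes are read as UTF-8 and every invalid
# sequence is silently replaced with U+FFFD.
misread = original.decode("utf-8", errors="replace")

# Saving the damaged text back as ISO-8859-15 fails, because U+FFFD
# does not exist in that charset.
try:
    misread.encode("iso-8859-15")
    convertible = True
except UnicodeEncodeError:
    convertible = False

print(repr(misread), "convertible:", convertible)
```

If that is what happened, the original characters are gone from those files and have to be restored from a backup or retyped; only files that were converted cleanly can be converted back automatically.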

so, I need:

  • a tool to search for UTF-8 encoded files

  • an editor that allows me to go from one encoding to another

  • oh yes!, a way to tell netbeans not to mess with my files!!!

(I have already tried editing /etc/netbeans.conf and adding -J-Dfile.encoding=UTF-8 or -J-Dfile.encoding=ISO-8859-15, with no luck; see http://wp.uberdose.com/2007/05/07/netbeans-and-utf-8/ and http://ditoinfo.wordpress.com/2007/02/26/netbeans-and-utf8-encoding-2/)

thanks a lot

edit:

(Hmm, I've just found http://wiki.netbeans.org/FaqI18nProjectEncoding, which says how to change the character encoding for a project. I'll give it a try. It also explains the mess NetBeans made:

For a new IDE installation, UTF-8 encoding is the default for new projects, as this encoding can handle any Unicode characters, making it the best choice for most people. When you create a new project, the IDE initially defaults to giving it the same encoding as the last project on which you set the encoding. If you want another encoding, just change it in the properties dialog.

and I created a new project from existing php sources, I guess that's what went wrong... )

+1  A: 

I'll take your points in order:

  • If you know a bit of Python, I recommend looking at decodeh.py. It uses a lowest-common-denominator strategy, so ISO-8859-15 files might be recognized as ISO-8859-1 if none of their characters lie outside the ISO-8859-1 range. I have tried it on UTF-8, ISO-8859-1 and ISO-8859-15 files and it is mostly correct. It uses the byte order mark and heuristics to guess the encoding.

  • The answer to this is easy: Emacs. Use M-x describe-current-coding-system to see what Emacs thinks the encoding of the file is. Emacs has never failed me in this respect. Use M-x set-buffer-file-coding-system to choose the encoding the buffer will be written with when you save it.

  • For NetBeans 6.7, you can configure the default encoding on a per-project basis. With a project open, go to File->Project Properties, choose Sources in the left menu, and at the bottom of the right-hand panel you'll see a dropdown box titled 'Encoding'.
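If you would rather script the first two points, here is a minimal Python sketch using only the standard library. It is not how decodeh.py works; it simply treats a file as UTF-8 when its bytes decode as UTF-8 and include at least one non-ASCII byte. The function names and the `.php` suffix default are my own choices for illustration, and the check is a heuristic: an ISO-8859-15 file can, in rare cases, also happen to be valid UTF-8.

```python
import os


def is_utf8(data: bytes) -> bool:
    """True if data is valid UTF-8 and contains at least one non-ASCII byte."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    # Pure ASCII is identical in both encodings, so it is not reported.
    return any(byte > 0x7F for byte in data)


def find_utf8_files(root: str, suffix: str = ".php"):
    """Yield paths under root whose content looks like UTF-8."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffix):
                path = os.path.join(dirpath, name)
                with open(path, "rb") as handle:
                    if is_utf8(handle.read()):
                        yield path


def utf8_to_latin9(data: bytes) -> bytes:
    """Re-encode UTF-8 bytes as ISO-8859-15.

    Raises UnicodeEncodeError if a character has no ISO-8859-15 equivalent.
    """
    return data.decode("utf-8").encode("iso-8859-15")
```

A cautious way to use it is to loop over `find_utf8_files("path/to/project")`, call `utf8_to_latin9` on each file's bytes, and write the result to a copy rather than over the original. A file that raises `UnicodeEncodeError` contains characters that cannot be converted and needs fixing by hand.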

Steen
+ If you use a NetBeans run configuration with a remote connection to upload your sources, NetBeans will detect that you changed the encoding and will upload all your sources.
JoeBilly