Hi guys,

I'm a Java developer and I develop on Ubuntu. The project was created on Windows with Eclipse and uses the CP1252 encoding.

To convert it to UTF-8 I used the recode program:

find Web -iname \*.java | xargs recode CP1252...UTF-8

This command gives the following error:

recode: Web/src/br/cits/projeto/geral/presentation/GravacaoMessageHelper.java failed: Ambiguous output in step `CR-LF..data'

I searched for it and found a solution here: http://fvue.nl/wiki/Bash_and_Windows#Recode:_Ambiguous_output_in_step_.60data..CR-LF.27, which says:

Convert line endings from CR/LF to a single LF: Edit the file with vim, give the command :set ff=unix and save the file. Recode now should run without errors.

Nice, but I have many files to strip the CR from, and I can't open each one by hand. Vi doesn't seem to offer a command-line option for this kind of batch operation from bash.

Can sed be used to do this? How?

Thanks =)

+1  A: 

There should be a program called dos2unix or fromdos that will fix line endings for you. If it's not already on your Linux box, it should be available via the package manager.
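
For a whole source tree, the line-ending fix can be applied in one pass without opening each file. A minimal sketch, assuming GNU find and GNU sed as shipped with Ubuntu (the -i flag and the \r escape are GNU extensions); strip_cr is a hypothetical helper name, not part of any package:

```shell
#!/bin/sh
# strip_cr DIR: remove a trailing carriage return from every line of
# each .java file under DIR, editing the files in place.
# Assumes GNU sed (-i and \r are GNU extensions).
strip_cr() {
    find "$1" -type f -iname '*.java' -exec sed -i 's/\r$//' {} +
}
```

Calling strip_cr Web would cover the tree from the question. If dos2unix is installed, it could take the place of the sed call, since it also accepts several file names at once.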

cHao
I've installed tofrodos, which provides the fromdos command, but the problem persists. fromdos -a GravacaoMessageHelper.java; recode CP1252...UTF-8 GravacaoMessageHelper.java returns: recode: GravacaoMessageHelper.java failed: Ambiguous output in step `CR-LF..data'
MaikoID
+1 for mentioning dos2unix.
Bernard
@MaikoID: Then you have bigger problems. recode shouldn't care about line endings anyway, as a CR is just another character to convert. And it doesn't seem to care on my machine.
cHao
A: 

The tr command can also do this:

tr -d '\15\32' < winfile.txt > unixfile.txt

and should be available to you.

tr only reads standard input and writes standard output; it does not take file names, so you'll need to run it from within a script. For example, create a file myscript.sh:

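#!/bin/bash

# Usage: myscript.sh DIR
cd "$1" || exit 1
# Read NUL-delimited names so paths containing spaces survive.
find . -iname '*.java' -print0 | while IFS= read -r -d '' f; do
    echo "$f"
    # Strip CR (octal 15) and Ctrl-Z (octal 32), then replace the original.
    tr -d '\15\32' < "$f" > "$f.tr" && mv "$f.tr" "$f"
    recode CP1252...UTF-8 "$f"
done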

Running myscript.sh Web would process all the .java files under the Web folder.
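
If recode still complains after a script like this has run, it may help to confirm that the carriage returns are really gone. A small sketch, assuming only find, grep and printf; list_cr_files is a hypothetical helper name:

```shell
#!/bin/sh
# list_cr_files DIR: print the path of every .java file under DIR
# that still contains at least one carriage return.
list_cr_files() {
    cr=$(printf '\r')   # a literal CR, built without shell-specific quoting
    find "$1" -type f -iname '*.java' -exec grep -l "$cr" {} +
}
```

An empty listing from list_cr_files Web would suggest the line endings are already converted and that the remaining error has some other cause.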

KeithL
How can I adapt this to find Web -iname \*.java | xargs recode CP1252...UTF-8?
MaikoID
You would need to run tr within a bash script, since it can't work on file names. I'll edit my answer with a sample script.
KeithL
Thanks for the answer, but the error persists =| Ambiguous output in step `CR-LF..data'
MaikoID
A: 

Go back to Windows, tell Eclipse to change the encoding to UTF-8, then go back to Unix and run d2u on the files.

Jonathan
Although if there are a lot of files, this may be more work than you're willing to put into it...
Jonathan