I'm trying to write a utility to migrate code from our custom source control to git.

The 'commit' messages in the old system were added incrementally to a text file in the project path, so I'm using git to diff these files from one version to the next, then using that diff as a commit message.

Unsurprisingly, I'm having trouble with accents.

The test string is: éèàçüûöôëê

  1. The source text file is (so Notepad tells me) in 'ANSI'.
  2. If I do a diff in msysgit, I see this: <E9><E8><E0><E7><FC><FB><F6><F4><EB><EA>
  3. When I run a Process call to git with the same diff command (see this question), I get:
    1. ÚÞÓþ³¹÷¶ÙÛ
    2. ���������� with StandardOutputEncoding = Encoding.UTF8, per this question.
    3. éèàçüûöôëê with StandardOutputEncoding = Encoding.Default
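
For reference, the three outputs above can be reproduced from the ten bytes msysgit shows. This is only a sketch in Python rather than .NET, and it assumes the console OEM code page is 850 and that 'ANSI' (Encoding.Default) means Windows-1252 on this machine; neither is confirmed, they are just the code pages that happen to match the output.

```python
# The ten bytes msysgit reports for the test string.
raw = bytes([0xE9, 0xE8, 0xE0, 0xE7, 0xFC, 0xFB, 0xF6, 0xF4, 0xEB, 0xEA])

# Assumption: with no encoding set, the output is decoded with the OEM
# console code page, taken here to be cp850.
as_oem = raw.decode("cp850")      # 'ÚÞÓþ³¹÷¶ÙÛ'

# These bytes are not valid UTF-8, so each one becomes U+FFFD.
as_utf8 = raw.decode("utf-8", errors="replace")

# Assumption: 'ANSI' / Encoding.Default is Windows-1252.
as_ansi = raw.decode("cp1252")    # 'éèàçüûöôëê'
```

So the bytes are the same in all three cases; only the code page used to interpret them differs.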

So far, so good, but if I send éèàçüûöôëê back as a commit message, git tells me "Warning: commit message does not conform to UTF-8".

Any pointers as to how to get this working? This question talks about setting encoding on StandardInput too, but my commit message is in the arguments, not part of the input.

(Git also says I can set the config variable i18n.commitencoding, but I'd rather respect the git format if I can)
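
One thing I may try, sketched here in Python rather than .NET: re-encode the message as UTF-8 and hand it to git on standard input with `git commit -F -`, instead of passing it in the arguments. The helper names `to_utf8` and `commit_with_message` are made up for illustration, and the Windows-1252 source encoding is an assumption about the old text files.

```python
import subprocess

def to_utf8(message_bytes, source_encoding="cp1252"):
    """Re-encode a legacy-encoded message as UTF-8 bytes.

    source_encoding is an assumption; use whatever the old files really are.
    """
    return message_bytes.decode(source_encoding).encode("utf-8")

def commit_with_message(repo_dir, message_bytes):
    """Commit staged changes, feeding the UTF-8 message to git on stdin."""
    # "-F -" tells git to read the commit message from standard input,
    # so the message never passes through command-line argument encoding.
    subprocess.run(
        ["git", "commit", "-F", "-"],
        cwd=repo_dir,
        input=to_utf8(message_bytes),
        check=True,
    )

# The test string as it is stored in the 'ANSI' source file.
legacy = bytes([0xE9, 0xE8, 0xE0, 0xE7, 0xFC, 0xFB, 0xF6, 0xF4, 0xEB, 0xEA])
utf8_message = to_utf8(legacy)
```

Each of the ten accented characters takes two bytes in UTF-8, so `utf8_message` is 20 bytes long. I haven't checked whether this actually makes the warning go away.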

A: 

Looks like it's got nothing to do with .NET or Process, or maybe even git. As far as I can tell, there's no way to type accents in bash anyway.

And in spite of the warnings from git, the commit messages are readable by other tools (gitk), so maybe all is well.

Benjol