Hi, I have a problem reading IniFiles with different encodings. If I read a Unicode file, GetPrivateProfileSectionNamesA seems to stumble over the first line. ASCII or ANSI works fine. I wrote a little program to illustrate my problem. First the output, then the program. I do not really care about UTF7 and UTF32, but what I don't get is the UTF8 part. Do I have to use a different function to read Unicode IniFiles? Am I doing something wrong? Hope somebody can help me, thanks Norbert

what I get:

IniEntriesWithSectionInFirstLine
first section using System.Text.ASCIIEncoding is FirstSectionInFirstLine
first section using System.Text.Latin1Encoding is FirstSectionInFirstLine
first section using System.Text.UTF7Encoding is
first section using System.Text.UTF8Encoding is SecondSection
first section using System.Text.UTF32Encoding is SecondSectio????????????

IniEntriesWithFirstLineEmpty
first section using System.Text.ASCIIEncoding is FirstSectionInSecondLine
first section using System.Text.Latin1Encoding is FirstSectionInSecondLine
first section using System.Text.UTF7Encoding is
first section using System.Text.UTF8Encoding is FirstSectionInSecondLine
first section using System.Text.UTF32Encoding is FirstSectionInSecondLin????????

the program:

using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.InteropServices;
using System.Text;

namespace TestIniRead
{
    internal class Program
    {
     [DllImport("kernel32.dll", EntryPoint = "GetPrivateProfileSectionNamesA")]
     private static extern int GetSectionNamesListA(
      byte[] lpszReturnBuffer,
      int nSize,
      string lpFileName);

     private static readonly string[] IniEntriesWithSectionInFirstLine = {
                                                      "[FirstSectionInFirstLine]",
                                                      "value=firsValue",
                                                      "",
                                                      "[SecondSection]",
                                                      "value=secondValue",
                                                      "",
                                                      "[ThirdSection]",
                                                      "value=secondValue",
                                                      ""
                                                     };
     private static readonly string[] IniEntriesWithFirstLineEmpty = {
                                                      "",
                                                      "[FirstSectionInSecondLine]",
                                                      "value=firsValue",
                                                      "",
                                                      "[SecondSection]",
                                                      "value=secondValue",
                                                      "",
                                                      "[ThirdSection]",
                                                      "value=secondValue",
                                                      ""
                                                     };

     private static void Main()
     {
      var fileInfo = new FileInfo("test.ini");
      Console.WriteLine("IniEntriesWithSectionInFirstLine");
      TestEncodings(fileInfo, IniEntriesWithSectionInFirstLine);
      Console.WriteLine("");
      Console.WriteLine("IniEntriesWithFirstLineEmpty");
      TestEncodings(fileInfo, IniEntriesWithFirstLineEmpty);
      Console.ReadLine();
     }

     private static void TestEncodings(FileInfo fileInfo, IEnumerable<string> iniEntries)
     {
      TestEncoding(fileInfo, iniEntries, Encoding.ASCII);
      TestEncoding(fileInfo, iniEntries, Encoding.GetEncoding("ISO-8859-1"));
      TestEncoding(fileInfo, iniEntries, Encoding.UTF7);
      TestEncoding(fileInfo, iniEntries, Encoding.UTF8);
      TestEncoding(fileInfo, iniEntries, Encoding.UTF32);
     }

     private static void TestEncoding(FileInfo fileInfo, IEnumerable<string> iniEntries, Encoding encoding)
     {
      CreateIniFile(fileInfo, iniEntries, encoding);
      if (fileInfo.Exists)
      {
       var buffer = new byte[fileInfo.Length];
       GetSectionNamesListA(buffer, (int) fileInfo.Length, fileInfo.FullName);
       String s = encoding.GetString(buffer);
       String[] names = s.Split('\0');

       Console.WriteLine("first section using {0} is {1}", encoding, names[0]);
      }
     }

     private static void CreateIniFile(FileSystemInfo fileInfo, IEnumerable<string> iniEntries, Encoding encoding)
     {
      using (var sw = new StreamWriter(File.Open(fileInfo.FullName, FileMode.Create), encoding))
      {
       foreach (string line in iniEntries)
       {
        sw.WriteLine(line);
       }
      }
     }
    }
}

Reaction to the first three answers:

You are of course right. I should use GetPrivateProfileSectionNamesW for Unicode files. I included a method to get the encoding of the IniFile and used A or W accordingly. The problem stays the same: the function does not return the first section. See the new code below, for UTF8 only.

what I get:

IniEntriesWithSectionInFirstLine
first section using System.Text.UTF8Encoding is SecondSection

the program:

using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.InteropServices;
using System.Text;

namespace TestIniRead
{
 internal class Program
 {
  [DllImport("kernel32.dll", EntryPoint = "GetPrivateProfileSectionNamesA")]
  private static extern int GetSectionNamesListA(
    byte[] lpszReturnBuffer,
    int nSize,
    string lpFileName);

  [DllImport("kernel32", EntryPoint = "GetPrivateProfileSectionNamesW", CharSet = CharSet.Unicode)]
  private static extern int GetSectionNames
   (
   // Output buffer: a char[] is writable, whereas a managed string is immutable
   // and must not be handed to native code as an output buffer.
   [Out] char[] szBuffer,
   int nlen,
   string filename
   );

  private static readonly string[] IniEntriesWithSectionInFirstLine = {
                "[FirstSectionInFirstLine]",
                "value=firsValue",
                "",
                "[SecondSection]",
                "value=secondValue",
                "",
                "[ThirdSection]",
                "value=secondValue",
                ""
              };

  private static void Main()
  {
   var fileInfo = new FileInfo("test.ini");
   Console.WriteLine("IniEntriesWithSectionInFirstLine");
   TestEncodings(fileInfo, IniEntriesWithSectionInFirstLine);
   Console.WriteLine("");
   Console.ReadLine();
  }

  private static void TestEncodings(FileInfo fileInfo, IEnumerable<string> iniEntries)
  {
   TestEncoding(fileInfo, iniEntries, Encoding.UTF8);
  }

  private static readonly char[] separator = { '\0' };

  private static void TestEncoding(FileInfo fileInfo, IEnumerable<string> iniEntries, Encoding encoding)
  {
   CreateIniFile(fileInfo, iniEntries, encoding);
   if (fileInfo.Exists)
   {
    // The file length in bytes is an upper bound for the number of characters needed.
    int len = (int)fileInfo.Length;
    var buffer = new char[len];
    int nlen = GetSectionNames(buffer, len, fileInfo.FullName);
    if (nlen <= 0)
    {
     Environment.Exit(nlen);
    }

    // nlen is the number of characters copied; names are separated by '\0'.
    String[] names = new string(buffer, 0, nlen).Split(separator);
    Console.WriteLine("first section using {0} is {1}", encoding, names[0]);
   }
  }

  private static void CreateIniFile
   (
   FileSystemInfo fileInfo, 
   IEnumerable<string> iniEntries, 
   Encoding encoding)
  {
   using (var sw = new StreamWriter(File.Open(fileInfo.FullName, FileMode.Create), encoding))
   {
    foreach (string line in iniEntries)
    {
     sw.WriteLine(line);
    }
   }
  }
 }
}
A: 

The first few bytes of a Unicode file can contain a byte order mark (BOM). Whatever text editor you are using is saving the Unicode file with a byte order mark, and that is confusing the API function.

Have you tried calling GetPrivateProfileSectionNamesW instead? (The A indicates the ANSI version of an API function; the W, for wide, indicates the Unicode version.)

Or you could just set your text editor to save the file without byte order marks.
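If the file is produced from .NET code rather than a text editor, the same idea can be sketched as follows. With StreamWriter, Encoding.UTF8 writes the EF BB BF preamble at the start of a new file, while a UTF8Encoding constructed with `false` does not. The class and method names (`IniWriter`, `WriteWithoutBom`) are made up for this illustration, and note that non-ASCII characters in such a file would still be interpreted as ANSI by the A function.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

internal static class IniWriter
{
    // Hypothetical helper: writes the lines as UTF-8 without a byte order mark.
    public static void WriteWithoutBom(string path, IEnumerable<string> lines)
    {
        // Passing false tells the encoding not to emit the UTF-8 identifier (BOM).
        var utf8NoBom = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false);

        // append: false overwrites any existing file.
        using (var writer = new StreamWriter(path, false, utf8NoBom))
        {
            foreach (string line in lines)
            {
                writer.WriteLine(line);
            }
        }
    }
}
```

With this, the first byte of the file is the `[` of the first section header instead of the start of a BOM.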

pipTheGeek
You are of course right. I should use GetPrivateProfileSectionNamesW for Unicode files. I included a method to get the encoding of the IniFile and used A or W accordingly. The problem stays the same: the function does not return the first section. See changes above.
NorbertKl
I still suspect that the API function is having problems processing the byte order marks at the start of the file. Try getting your text editor not to include any. You can check if they are present by opening the file in a hex editor. (Textpad can open text files in hex view and allows you to control what if any byte order marks are included)
pipTheGeek
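The hex-editor check suggested above can also be done in code. This is a minimal sketch that reads the first four bytes of a file and compares them against the common byte order marks; the names `BomInspector` and `DescribeBom` are hypothetical. It only looks for a BOM and does not try to guess the encoding of a file that has none.

```csharp
using System;
using System.IO;

internal static class BomInspector
{
    // Hypothetical helper: names the BOM found at the start of the file, or "none".
    public static string DescribeBom(string path)
    {
        var head = new byte[4];
        int read = 0;
        using (FileStream stream = File.OpenRead(path))
        {
            // Read may return fewer bytes than requested, so loop until full or EOF.
            int n;
            while (read < head.Length &&
                   (n = stream.Read(head, read, head.Length - read)) > 0)
            {
                read += n;
            }
        }

        // UTF-32 LE must be tested before UTF-16 LE: both begin with FF FE.
        if (read >= 4 && head[0] == 0xFF && head[1] == 0xFE && head[2] == 0x00 && head[3] == 0x00)
            return "UTF-32 LE";
        if (read >= 4 && head[0] == 0x00 && head[1] == 0x00 && head[2] == 0xFE && head[3] == 0xFF)
            return "UTF-32 BE";
        if (read >= 3 && head[0] == 0xEF && head[1] == 0xBB && head[2] == 0xBF)
            return "UTF-8";
        if (read >= 2 && head[0] == 0xFF && head[1] == 0xFE)
            return "UTF-16 LE";
        if (read >= 2 && head[0] == 0xFE && head[1] == 0xFF)
            return "UTF-16 BE";
        return "none";
    }
}
```

Calling `BomInspector.DescribeBom("test.ini")` after each CreateIniFile call in the question's program would show which of the tested encodings put a BOM in front of the first section header.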
A: 
  1. Have you tried GetPrivateProfileSectionNamesW?
  2. Can you just make sure the ini file is stored in ASCII? From the MSDN documentation:

    Note This function is provided only for compatibility with 16-bit Windows-based applications.

  3. The .NET settings files are vastly superior to the INI files. If you aren't writing something to interoperate with legacy systems, I highly recommend using the new way.

280Z28
You are of course right. I should use GetPrivateProfileSectionNamesW for Unicode files. I included a method to get the encoding of the IniFile and used A or W accordingly. The problem stays the same: the function does not return the first section. I have to read IniFiles and cannot change to XML. See changes above.
NorbertKl
A: 

I have actually seen the same thing, but without doing the testing you have (I just made sure to have an empty line at the beginning of the ini file).

I was originally writing the ini file using the IO functions in the .NET Framework, and when another program written in old-fashioned C++ read it, the first line was missing. I ended up changing my .NET code to use the ISO-8859-1 encoding, which is probably the closest to how basic text files were written before Unicode came along... The default encoding in .NET is UTF8. In many cases, Encoding.ASCII would probably be OK, but that covers just the first 128 characters (code points 0 to 127).

In most cases, I think Encoding.Default would be good to use, because it represents the default code page of the running Windows instance, which in my case (and probably in your case) maps to the ISO-8859-1 encoding. In other parts of the world, it will map to other subsets of the ISO-8859 standard.
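A minimal sketch of writing the file in ISO-8859-1, as described above. The names `Latin1IniWriter` and `Write` are invented for the example. It asks for the encoding by name instead of relying on Encoding.Default, because on .NET Core and .NET 5 or later Encoding.Default is UTF-8 rather than the system code page.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

internal static class Latin1IniWriter
{
    // Hypothetical helper: writes the lines with one byte per character and no BOM.
    public static void Write(string path, IEnumerable<string> lines)
    {
        // ISO-8859-1 has no preamble, so nothing precedes the first section header.
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");

        using (var writer = new StreamWriter(path, false, latin1))
        {
            foreach (string line in lines)
            {
                // Characters outside ISO-8859-1 are replaced with '?' by the encoder.
                writer.WriteLine(line);
            }
        }
    }
}
```

The trade-off is that only characters representable in ISO-8859-1 survive the round trip.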

awe
I actually convert the IniFiles to ISO-8859-1. But I think I should not have to do that.
NorbertKl