Hi, I don't know C++ very well, especially the I/O part. Can anyone please help me translate the following C++ code into C#?

unsigned *PostingOffset, *PostingLength, NumTerms;

void LoadSubIndex(char *subindex) {
  FILE *in = fopen(subindex, "rb");
  if (in == 0) {
    printf("Error opening sub-index file '%s'!\n", subindex);
    exit(EXIT_FAILURE);
  }
  int len=0;
  // Array of terms
  char **Term;
  char *TermList;
  fread(&NumTerms, sizeof(unsigned), 1, in);
  PostingOffset = (unsigned*)malloc(sizeof(unsigned) * NumTerms);
  PostingLength = (unsigned*)malloc(sizeof(unsigned) * NumTerms);
  Term = (char**)malloc(sizeof(char*) * NumTerms);
  Term = (char**)malloc(sizeof(char*) * NumTerms);
  // Offset of each posting
  fread(PostingOffset, sizeof(unsigned), NumTerms, in);
  // Length of each posting in bytes
  fread(PostingLength, sizeof(unsigned), NumTerms, in);
  // Number of bytes in the posting terms array
  fread(&len, sizeof(unsigned), 1, in); 
  TermList = (char*)malloc(sizeof(char) * len);
  fread(TermList, sizeof(unsigned)*len, 1, in);

  unsigned k=1;
  Term[0] = &TermList[0];
  for (int i=1; i<len; i++) {
    if (TermList[i-1] == '\0') {
      Term[k] = &TermList[i];
      k++;
    }
  }
  fclose(in);
}

Thanks in advance.

+6  A: 

I'll give you a headstart.

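// Needs: using System; using System.IO;
using (var reader = new BinaryReader(new FileStream(subindex, FileMode.Open))) {
    // ReadUInt32 returns uint, so cast it before using it as an array length
    int numTerms = (int)reader.ReadUInt32();
    postingOffset = new UInt32[numTerms];
    postingLength = new UInt32[numTerms];
    var term = new byte[numTerms];
    // Offset of each posting
    for (int i = 0; i < numTerms; i++)
        postingOffset[i] = reader.ReadUInt32();
    // Length of each posting in bytes
    for (int i = 0; i < numTerms; i++)
        postingLength[i] = reader.ReadUInt32();
    // Number of bytes in the term list
    var len = reader.ReadInt32();
    var termList = new ... // byte[] or uint32[] ??
    //etc
}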

There's no need to close the file handle explicitly here: the reader and its underlying stream are disposed when execution leaves the using { } block.

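I didn't finish it because there are some flaws in your code. TermList is allocated with len bytes, but the fread call asks for sizeof(unsigned)*len bytes, so with a 4-byte unsigned you are reading in 4 times as much data as you've allocated. You shouldn't be allocating Term twice either: the first allocation is never freed, so it leaks.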
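For reference, here is a minimal sketch of what the C++ loader might look like with those two flaws removed. It assumes the layout implied by the question (a 32-bit NumTerms, the offsets, the lengths, a 32-bit byte count, then NUL-terminated terms) and native byte order. SubIndex and ReadU32 are hypothetical names introduced only for this illustration, not part of the original code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical container for the loaded data (not in the original code).
struct SubIndex {
    std::vector<std::uint32_t> postingOffset;
    std::vector<std::uint32_t> postingLength;
    std::vector<std::string> terms;
};

// Reads `count` 32-bit values; returns false on a short read.
static bool ReadU32(std::FILE* in, std::uint32_t* dst, std::size_t count) {
    return std::fread(dst, sizeof(std::uint32_t), count, in) == count;
}

// Assumed layout: NumTerms, offsets, lengths, byte count of the term
// list, then the term bytes, each term terminated by '\0'.
bool LoadSubIndex(const char* path, SubIndex& out) {
    assert(path != nullptr);
    std::FILE* in = std::fopen(path, "rb");
    if (in == nullptr) return false;

    std::uint32_t numTerms = 0, len = 0;
    bool ok = ReadU32(in, &numTerms, 1);
    if (ok) {
        out.postingOffset.resize(numTerms);
        out.postingLength.resize(numTerms);
        ok = ReadU32(in, out.postingOffset.data(), numTerms)
          && ReadU32(in, out.postingLength.data(), numTerms)
          && ReadU32(in, &len, 1);
    }
    std::vector<char> termList;
    if (ok) {
        termList.resize(len);
        // Read exactly len bytes: the element size is sizeof(char),
        // not sizeof(unsigned) as in the question.
        ok = std::fread(termList.data(), sizeof(char), len, in) == len;
    }
    std::fclose(in);
    if (!ok) return false;

    // Cut the buffer at each '\0'; stop once numTerms strings are collected.
    out.terms.clear();
    std::size_t start = 0;
    for (std::size_t i = 0;
         i < termList.size() && out.terms.size() < numTerms; ++i) {
        if (termList[i] == '\0') {
            out.terms.emplace_back(termList.data() + start, i - start);
            start = i + 1;
        }
    }
    return out.terms.size() == numTerms;
}
```

Using std::vector and std::string means nothing is allocated twice and nothing needs freeing by hand. The same shape carries over to C#: one byte[] of length len, split at the zero bytes.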

To turn a term's bytes back into a string, use Encoding.ASCII.GetString(term).TrimEnd('\0') (Encoding is in the System.Text namespace).

Mark H
Who said the integers in the file were 32 bits long?
Martin York
@Martin: As Mark said, it's a headstart.
Oliver