I have a string value read in from a CSV file. The CSV file contains 7 NULL bytes; I have confirmed this by opening it in a hex editor, and sure enough there are 7 0x0 bytes in there. This string is causing me pain.

In VB.NET, when I check the length of this string it returns 7, and if I call String.IsNullOrWhiteSpace on it, it returns False.

I cannot understand why this is. I have split the string into a byte array and each byte is 0x0, which is null/nothing. A string = Nothing comparison also fails.

I want to be able to replace this string with a string of my own but I cannot do this dynamically. Any suggestions why this string returns a length of 7 even though each byte is 0x0?

A: 

Not sure which version of VB.NET you're using but .NET 2.0 has String.IsNullOrEmpty(yourstring). What do you get when using this method?

Raj
He will get False, as the string is neither null nor empty.
Guffa
-1 Not an answer.
C. Ross
+2  A: 

The null character is not whitespace, and your string reference is not Nothing, so I would expect String.IsNullOrWhiteSpace() to return False.

Rowland Shaw
So if that is the case how do I check for a string of NULL chars?
WizardsSleeve
You'd need to either convert them to "something else", for example using `yourstring = yourstring.Replace(Chr(0), " "c)` to convert them to spaces, or iterate over the string and check yourself.
Rowland Shaw
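
The "iterate over the string and check yourself" option could look roughly like the sketch below. It is not from the thread: the module and function names (`NullCharCheck`, `IsAllNullChars`) are hypothetical, and it assumes a string of only NUL characters should count as "blank" while Nothing and "" should not.

```vbnet
Imports Microsoft.VisualBasic

Module NullCharCheck
    ' Hypothetical helper: True only when s is non-empty and every
    ' character in it is the NUL character (character code 0).
    Function IsAllNullChars(s As String) As Boolean
        If String.IsNullOrEmpty(s) Then Return False
        For Each c As Char In s
            If c <> ControlChars.NullChar Then Return False
        Next
        Return True
    End Function
End Module
```

With this, `IsAllNullChars(New String(ControlChars.NullChar, 7))` is True, while a string with any other character in it gives False.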
A: 

IsNullOrWhiteSpace checks whether the variable itself is null, not whether the string contains NUL characters. A NUL character is not whitespace, so this check also fails.

I suggest you Trim() the NUL characters before the test. In C# this will look like:

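bool MyNullCheck(string s) {
    // Guard first: calling Trim on a null reference would throw.
    if (s == null) return false;
    // Strip leading and trailing NUL characters (Trim takes chars, not a string).
    s = s.Trim('\0');
    return string.IsNullOrWhiteSpace(s);
}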

An attempted conversion to VB (not checked):

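Function MyNullCheck(s As String) As Boolean
  ' Guard first: calling Trim on Nothing would throw.
  If s Is Nothing Then
     Return False
  End If
  ' Strip leading and trailing NUL characters (Trim takes Chars, not a String).
  s = s.Trim(ControlChars.NullChar)
  Return String.IsNullOrWhiteSpace(s)
End Function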
GvS
+1  A: 

Unfortunately, seven null characters in a row are neither an empty string nor a null string. Remember that in .NET a string is, at some level, a pointer to a character array. A string is null if this pointer is set to null. A string is empty if the pointer points to a zero-length array. In this case the pointer points to a length-seven array of null characters (each byte being zero).

Null String

A ->

Empty String

A -> ()

Your String

A -> ((0)(0)(0)(0)(0)(0)(0))

You can test for this null character by using:

char nullChar = '\0';
string nullCharString = new string(nullChar, 1);
bool hasNullChar = A.Contains(nullCharString);
C. Ross
+1  A: 

A character with the character code zero is a character just like any other. If you have a string with seven such characters, the length is seven. The NUL character is not a white-space character, and a string containing NUL characters is not the same as a string reference that is null (Nothing).

You could use the Trim method (or TrimEnd) to remove the NUL characters by specifying that it should trim NUL characters: str = str.Trim(Chr(0)), but I think that you should rather ask yourself why there are NUL characters in the string to start with.

Are you reading the data properly from the file? A common error is to use the Read method to read from a stream but ignore its return value, and thus end up with a buffer only partly filled with data from the stream. As a byte array is filled with zeroes when you create it, the bytes not set by the Read operation would remain zero and become NUL characters when you decode the data into a string.
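
A minimal sketch of honouring the return value of Read is shown below. It is an illustration rather than part of the original answer: the names `StreamReading` and `ReadChunk` are hypothetical, and ASCII is an assumed encoding that should be swapped for whatever the file actually uses.

```vbnet
Imports System.IO
Imports System.Text

Module StreamReading
    ' Hypothetical helper: decodes only the bytes that Read actually delivered.
    Function ReadChunk(stream As Stream, bufferSize As Integer) As String
        ' A new Byte array starts out filled with zeroes.
        Dim buffer(bufferSize - 1) As Byte
        ' Read may deliver fewer bytes than requested; it returns the count.
        Dim bytesRead As Integer = stream.Read(buffer, 0, buffer.Length)
        ' Decoding the whole buffer instead would turn the untouched
        ' zero bytes into trailing NUL characters.
        Return Encoding.ASCII.GetString(buffer, 0, bytesRead)
    End Function
End Module
```

The key point is passing `bytesRead` to GetString instead of decoding `buffer` in full.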

Guffa
The string is coming from another program's output, over which I have no control. I am reading it in correctly, as I have verified the contents of the file using a hex editor.
WizardsSleeve
@WizardsSleeve: Ok, if you are sure that you can safely ignore the NUL characters, then you can just trim them from the string. If the string contains only NUL characters, you end up with an empty string after that.
Guffa
A: 
  • A null string is one that hasn't been initialised or has been set to Nothing.
  • An empty string is one that contains the empty string String.Empty or "".
  • Whitespace characters are space, tab, newline, carriage return and lots more. But not the null character.
  • Your string is neither empty nor Nothing. It contains 7 characters, each one is the null character - so it is not whitespace.

You could use String.Replace to remove the zero characters. Something like this:

s = s.Replace(vbNullChar, "")
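
Building on that Replace call, swapping in a string of your own might be sketched as follows. The names `NullCharReplace` and `ReplaceIfBlank` are hypothetical, and the sketch assumes that Nothing, an empty string, whitespace and NUL-only strings should all be replaced; String.IsNullOrWhiteSpace needs .NET 4 or later.

```vbnet
Imports Microsoft.VisualBasic

Module NullCharReplace
    ' Hypothetical helper: returns replacement when s holds nothing but
    ' NUL characters and/or whitespace; otherwise returns s unchanged.
    Function ReplaceIfBlank(s As String, replacement As String) As String
        If s Is Nothing Then Return replacement
        ' Drop every NUL character, then see whether anything meaningful is left.
        Dim cleaned As String = s.Replace(vbNullChar, "")
        If String.IsNullOrWhiteSpace(cleaned) Then Return replacement
        Return s
    End Function
End Module
```

For the seven-NUL string from the question, `ReplaceIfBlank(value, "N/A")` would return "N/A".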
MarkJ
A: 

I bet you have run into an encoding issue. Try reading the file as UTF-16.

erikkallen