Hi,

I was experimenting with basic VB.NET file I/O and string splitting and ran into a problem. I don't know whether it is caused by the file I/O or by the string splitting.

I am writing text to a file like so:

Dim sWriter As New StreamWriter("Data.txt")
sWriter.WriteLine("FirstItem")
sWriter.WriteLine("SecondItem")
sWriter.WriteLine("ThirdItem")
sWriter.Close()

Then I read the text back from the file:

Dim sReader As New StreamReader("Data.txt")
Dim fileContents As String = sReader.ReadToEnd()
sReader.Close()

Now, I am splitting fileContents using Environment.NewLine as the delimiter.

Dim tempStr() As String = fileContents.Split(Environment.NewLine)

When I print the resulting array, I get some weird results:

For Each str As String In tempStr
  Console.WriteLine("*" + str + "*")
Next

I added the *s to the beginning and end of each array item when printing, to find out what is going on. Since NewLine is used as the delimiter, I expected the strings in the array NOT to contain any newlines. But the output was this -

*FirstItem*
*
SecondItem*
*
ThirdItem*
*
*

Shouldn't it be this -

*FirstItem*
*SecondItem*
*ThirdItem*
**

??

Why is there a new line in the beginning of all but the first string?

Update: I did a character by character print of fileContents and got this -

F - 70
i - 105
r - 114
s - 115
t - 116
I - 73
t - 116
e - 101
m - 109
 - 13

 - 10
S - 83
e - 101
c - 99
o - 111
n - 110
d - 100
I - 73
t - 116
e - 101
m - 109
 - 13

 - 10
T - 84
h - 104
i - 105
r - 114
d - 100
I - 73
t - 116
e - 101
m - 109
 - 13

 - 10

It seems 'Environment.NewLine' consists of

 - 13

 - 10

13 and 10, which I understand. But what about the blank line in between? I don't know whether it comes from printing to the console or is really part of NewLine.

So, when splitting, only the character with ASCII value 13 (carriage return), which is the first character of NewLine, is used as the delimiter (as explained in the replies), and the remaining character, ASCII 10 (line feed), stays at the start of each following string. The blank line in the list above is most likely just the console acting on the line feed itself when I print it, so the leftover ASCII 10 is what produces the new line in my output.
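
A minimal sketch of the effect described above, assuming CR+LF line endings as in the character dump. The module and the helpers `SplitOnCarriageReturn` and `ShowControlChars` are hypothetical names made up for this illustration; the control characters are replaced with visible tags so the console does not act on them.

```vbnet
Imports System

Module CarriageReturnSplitDemo
    ' Hypothetical helper: splits on the carriage return (ASCII 13) only,
    ' which is what Split(Environment.NewLine) ended up doing above.
    Function SplitOnCarriageReturn(ByVal contents As String) As String()
        Dim cr As Char = Convert.ToChar(13)
        Return contents.Split(cr)
    End Function

    ' Hypothetical helper: makes CR and LF visible instead of printing them raw.
    Function ShowControlChars(ByVal s As String) As String
        Dim cr As String = Convert.ToChar(13).ToString()
        Dim lf As String = Convert.ToChar(10).ToString()
        Return s.Replace(cr, "<CR>").Replace(lf, "<LF>")
    End Function

    Sub Main()
        Dim crlf As String = Convert.ToChar(13).ToString() & Convert.ToChar(10).ToString()
        Dim contents As String = "FirstItem" & crlf & "SecondItem" & crlf

        For Each part As String In SplitOnCarriageReturn(contents)
            Console.WriteLine("*" & ShowControlChars(part) & "*")
        Next
        ' Prints:
        ' *FirstItem*
        ' *<LF>SecondItem*
        ' *<LF>*
    End Sub
End Module
```

Every piece after the first keeps its leading line feed, which matches the stray line breaks in the output shown earlier.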

Now it is clear. Thanks for the help. :)

+3  A: 

First of all, yes, WriteLine tacks on a newline to the end of the string, hence the blank line at the end.

The problem is the way you're calling fileContents.Split(). The only version of that function that takes a single argument takes a Char(), not a String. Environment.NewLine is a String, not a Char, so (assuming you have Option Strict Off) the call implicitly converts it to a Char, using only the first character of the string. This means that instead of splitting on the actual two-character sequence that makes up Environment.NewLine, it splits only on the first of those characters.

To get your desired output, you need to call it like this:

Dim delims() As String = { Environment.NewLine }
Dim tempStr() As String = fileContents.Split(delims, _
                          StringSplitOptions.RemoveEmptyEntries)

This will cause it to split on the actual string, rather than the first character as it's doing now, and it will remove any blank entries from the results.
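
To see what the options argument changes, here is a small sketch under the assumption that the contents have the same shape as the file written by the three WriteLine calls. `SplitOptionsDemo` and `SplitLines` are hypothetical names used only for this illustration.

```vbnet
Imports System

Module SplitOptionsDemo
    ' Hypothetical helper: splits on the whole Environment.NewLine string.
    Function SplitLines(ByVal contents As String, ByVal options As StringSplitOptions) As String()
        Dim delims() As String = {Environment.NewLine}
        Return contents.Split(delims, options)
    End Function

    Sub Main()
        ' Three items, each followed by a newline, as WriteLine would produce.
        Dim contents As String = "FirstItem" & Environment.NewLine & _
                                 "SecondItem" & Environment.NewLine & _
                                 "ThirdItem" & Environment.NewLine

        Dim keepEmpty() As String = SplitLines(contents, StringSplitOptions.None)
        Dim dropEmpty() As String = SplitLines(contents, StringSplitOptions.RemoveEmptyEntries)

        ' 4: the three items plus one trailing empty string
        Console.WriteLine(keepEmpty.Length)
        ' 3: the trailing empty string is dropped
        Console.WriteLine(dropEmpty.Length)
    End Sub
End Module
```

With StringSplitOptions.None the result matches the `**` line the question expected at the end; RemoveEmptyEntries drops it.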

Adam Robinson
By the way, that's not completely right: it just uses the first char of the string, leaving all \n chars in the split strings.
Philip Daubmeier
Or just go old school and use Split(fileContents, vbNewLine)
Chris Haas
@Chris: Which will leave a useless `CR` character at the end of all the strings.
Adam Robinson
@Chris, @Adam. Chris is right. The `vbNewLine` constant is `CRLF` and the VB `Split` function allows multi-character delimiters, so Chris's suggestion will remove the `CR` and the `LF`. http://msdn.microsoft.com/en-us/library/6x627e5f.aspx
MarkJ
@MarkJ: I see, Chris was recommending the use of the VB-specific split function. Using language-specific types and functions (such as `Split` and `vbNewLine`) is generally discouraged. They're only in the language to make the job of porting legacy code easier; new development should take advantage of standard BCL types and functions.
Adam Robinson
@Adam: good point. I avoid using the vbXXXXX constants and always look for .NET alternatives.
Senthil
+3  A: 

Why not just use File.ReadAllLines? One single call reads the file and returns a string array with the lines.

Dim tempStr() As String = File.ReadAllLines("Data.txt")
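
As a rough sketch of the whole round trip without any manual splitting, assuming a temporary file is acceptable for the test: File.WriteAllLines is swapped in here for the StreamWriter calls in the question, and `ReadAllLinesDemo` and `RoundTrip` are hypothetical names for illustration only.

```vbnet
Imports System
Imports System.IO

Module ReadAllLinesDemo
    ' Hypothetical helper: writes the lines to a file and reads them back.
    Function RoundTrip(ByVal fileName As String, ByVal lines() As String) As String()
        File.WriteAllLines(fileName, lines)
        Return File.ReadAllLines(fileName)
    End Function

    Sub Main()
        ' A temp file stands in for Data.txt so nothing is overwritten.
        Dim fileName As String = Path.GetTempFileName()
        Dim items() As String = {"FirstItem", "SecondItem", "ThirdItem"}

        Dim readBack() As String = RoundTrip(fileName, items)
        For Each item As String In readBack
            Console.WriteLine("*" & item & "*")
        Next
        ' Prints:
        ' *FirstItem*
        ' *SecondItem*
        ' *ThirdItem*

        File.Delete(fileName)
    End Sub
End Module
```

ReadAllLines strips the line terminators and returns no trailing empty entry, so no RemoveEmptyEntries step is needed.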
MarkJ
Hi, yes, I used that in the beginning. But when the problem occurred, I thought I was doing the file I/O wrong. I wanted to make sure the problem was not the way I was reading and writing files, so I switched to StreamReader and StreamWriter (things I am familiar with from earlier coding experience). I forgot to switch back. Thanks for the nice suggestion anyway :)
Senthil