Hi all,

I have a TXT file that I need to import via an application, but for some reason I need to open it in WordPad first and then save it before importing it. I'm guessing it has to do with line breaks, because if I open it in Notepad first there are no line breaks, but if I open it with WordPad the lines are separated.

Does anyone know why this occurs and how I can avoid having to manually open the file and save it with WordPad? The app is written in VB6 (yikes!).

Thanks for any help

+2  A: 

This is a line-ending problem. Your code (and Notepad) expect carriage return (CR)/line feed (LF) pairs, and this is probably a CR-only (classic Macintosh) or LF-only (Unix) file. WordPad is more forgiving, and on save it is apparently (I haven't tested it) writing CR/LF pairs for you.

You can change the application's code to accept any of the endings: stop looking for vbCrLf as a pair and treat either character as the end of a line. My own strategy is to scan for a CR or LF and then consume all the CR/LF characters that follow; this discards blank lines as well.

Godeke
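
A minimal VB6 sketch of the scan-and-consume strategy described in the answer above. This is not the poster's actual code; the function name SplitAnyLineEnding is hypothetical, and it assumes the whole file has already been read into a string.

```vb
' Hypothetical helper: splits Text on any run of CR and/or LF characters.
' Blank lines are dropped because a whole run counts as one separator.
Public Function SplitAnyLineEnding(ByVal Text As String) As Collection
    Dim Lines As Collection
    Dim Pos As Long
    Dim StartPos As Long
    Dim Ch As String

    Set Lines = New Collection
    StartPos = 1
    Pos = 1
    Do While Pos <= Len(Text)
        Ch = Mid$(Text, Pos, 1)
        If Ch = vbCr Or Ch = vbLf Then
            ' End of a line: store it unless it is empty.
            If Pos > StartPos Then Lines.Add Mid$(Text, StartPos, Pos - StartPos)
            ' Consume every CR/LF character that follows.
            Do While Pos <= Len(Text)
                Ch = Mid$(Text, Pos, 1)
                If Ch <> vbCr And Ch <> vbLf Then Exit Do
                Pos = Pos + 1
            Loop
            StartPos = Pos
        Else
            Pos = Pos + 1
        End If
    Loop
    ' Keep a final line that has no terminator.
    If StartPos <= Len(Text) Then Lines.Add Mid$(Text, StartPos)
    Set SplitAnyLineEnding = Lines
End Function
```

Because consecutive terminators are treated as one separator, this is only suitable when empty lines carry no meaning for the import.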
A: 

The file probably has only a Carriage Return (CR) or a Line Feed (LF) character at the end of each line.

In Windows, you need both a CR and an LF character at the end of each line. If you are the one writing the file from VB6, the constant vbCrLf gives you both.

On the flip side, if you are the one reading the file, you can determine which character is missing and add it as you read (i.e., use the Replace function to convert CR into CRLF, or LF into CRLF).

Daemonic
Or accept anything by replacing CRLF with LF, then CR with LF, and finally LF with CRLF. Or something like that.
MarkJ
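
The three-step replacement suggested in the comment above could be sketched in VB6 roughly as follows. The function name NormalizeToCrLf is hypothetical, and the sketch assumes the file contents are already in a string.

```vb
' Hypothetical helper: converts CRLF, lone CR, and lone LF line endings to CRLF.
Public Function NormalizeToCrLf(ByVal Text As String) As String
    Dim Result As String
    Result = Replace(Text, vbCrLf, vbLf)    ' CRLF -> LF, so existing pairs are not doubled later
    Result = Replace(Result, vbCr, vbLf)    ' any remaining lone CR -> LF
    Result = Replace(Result, vbLf, vbCrLf)  ' every LF -> CRLF
    NormalizeToCrLf = Result
End Function
```

The order matters: collapsing CRLF first means the final step cannot turn an existing pair into CR CR LF. Unlike the scan-and-consume approach, this keeps blank lines.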
A: 

Are you getting this file from a Unix system? It might help to read the Wikipedia article on Newline.

You could also look at the various unix2dos programs available online. They will handle the conversion for you. There may even be a VB DLL.

edoloughlin
A: 

Unless these files are very large and performance is critical, reading them line by line can be accomplished easily via the ADODB.Stream object.

Not only will this handle several line delimiters (Stream.LineSeparator = adCR, adCRLF, or adLF), it can also be used to process files containing Unicode (UTF-16), UTF-8, system codepage ANSI, and alternative "ANSI" encodings for other locales.

For example if you have a text file that contains "ANSI" from a Russian language locale you can set Stream.Charset = "koi8-r" and read the data with proper translation into VB6 Unicode (UTF-16):

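'Requires a project reference to Microsoft ActiveX Data Objects 2.5 or later.
Dim Stm As ADODB.Stream
Dim Line As String
Dim Counter As Long
Set Stm = New ADODB.Stream
With Stm
    .Open
    .LoadFromFile "russian.txt"
    .Type = adTypeText
    .Charset = "koi8-r"      'Encoding of the file on disk.
    .LineSeparator = adLF    'The delimiter the file actually uses.
    Do Until .EOS
        Line = .ReadText(adReadLine) 'Text is in Unicode now.
        Counter = Counter + 1        'Count the lines read.
    Loop
    .Close
End With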

Charset defaults to the value "unicode" (UTF-16) but to read or write the Stream in ANSI with the default codepage you can set it to "ascii" instead.

HKCR\MIME\Database\Charset contains the available values.

Bob