views:

900

answers:

2

I'm looking for the best way of loading formatted data in VBA. I’ve spent quite some time trying to find the equivalent of C-like or Fortran-like fscanf type functions, but without success.

Basically I want to read from a text file millions of numbers placed on many (100,000’s) lines with 10 numbers each (except the last line, possibly 1-10 numbers). The numbers are separated by spaces, but I don’t know in advance the width of each field (and this width changes between data blocks). e.g.

  397143.1   396743.1   396343.1   395943.1   395543.1   395143.1   394743.1   394343.1   393943.1   393543.1

   -0.11    -0.10    -0.10    -0.10    -0.10    -0.09    -0.09    -0.09    -0.09    -0.09

 0.171  0.165  0.164  0.162  0.158  0.154  0.151  0.145  0.157  0.209

Previously I’ve used the “Mid” function but in this case I can’t, because I don’t know in advance the width of each field. Also it's too many lines to load in an Excel sheet. I can think of a brute force way in which I look at each successive character and determine whether it’s a space or a number, but it seems terribly clumsy.

I’m also interested in pointers on how to write formatted data, but this seems easier -- just format each string and concatenate them using &.

Cheers

+1  A: 

The following snippet will read whitespace-delimited numbers from a text file:

Dim someNumber As Double

Open "YourDataFile.txt" For Input As #1

Do While Not (EOF(1))
    Input #1, someNumber
    `// do something with someNumber here...`
Loop

Close #1

update: Here is how you could read one line at a time, with a variable number of items on each line:

Dim someNumber As Double
Dim startPosition As Long
Dim endPosition As Long
Dim temp As String

Open "YourDataFile" For Input As #1

Do While Not (EOF(1))
    startPosition = Seek(1)  '// capture the current file position'
    Line Input #1, temp      '// read an entire line'
    endPosition = Seek(1)    '// determine the end-of-line file position'
    Seek 1, startPosition    '// jump back to the beginning of the line'

    '// read numbers from the file until the end of the current line'
    Do While Not (EOF(1)) And (Seek(1) < endPosition)
        Input #1, someNumber
        '// do something with someNumber here...'
    Loop

Loop

Close #1
e.James
Super! I'll actually use a combination of those two methods.
JF
Glad to be of help :)
e.James
Over a year later and it's still useful :)
PowerUser
+1  A: 

You could also use regular expressions to replace multiple whitespaces to one space and then use the Split function for each line like the example code shows below.

After 65000 rows have been processed a new sheet will be added to the Excel workbook so the source file can be bigger than the max number of rows in Excel.

Dim rx As RegExp

Sub Start()

    Dim fso As FileSystemObject
    Dim stream As TextStream
    Dim originalLine As String
    Dim formattedLine As String
    Dim rowNr As Long
    Dim sht As Worksheet
    Dim shtCount As Long

    Const maxRows As Long = 65000

    Set fso = New FileSystemObject
    Set stream = fso.OpenTextFile("c:\data.txt", ForReading)

    rowNr = 1
    shtCount = 1

    Set sht = Worksheets.Add
    sht.Name = shtCount

    Do While Not stream.AtEndOfStream
        originalLine = stream.ReadLine
        formattedLine = ReformatLine(originalLine)
        If formattedLine <> "" Then
            WriteValues formattedLine, rowNr, sht
            rowNr = rowNr + 1
            If rowNr > maxRows Then
                rowNr = 1
                shtCount = shtCount + 1
                Set sht = Worksheets.Add
                sht.Name = shtCount
            End If
        End If
    Loop

End Sub


Function ReformatLine(line As String) As String

    Set rx = New RegExp

    With rx
        .MultiLine = False
        .Global = True
        .IgnoreCase = True
        .Pattern = "[\s]+"
        ReformatLine = .Replace(line, " ")
    End With

End Function


Function WriteValues(formattedLine As String, rowNr As Long, sht As Worksheet)

    Dim colNr As Long
    colNr = 1

    stringArray = Split(formattedLine, " ")
    For Each stringItem In stringArray
        sht.Cells(rowNr, colNr) = stringItem
        colNr = colNr + 1
    Next

End Function
Marc
For this particular task I don't need to load the numbers into actual Excel sheets, and it would be a bit cumbersome to deal with tens of sheets... But, I'll make sure to make note of the method you suggest, as I am sure I will need something of the sort for other tasks. Thanks!
JF
Ah I thought you wanted to capture the values in Excel for further processing/charting. What do you do with the values now?
Marc