I have a log file that I would like to parse, and I'm having some issues. At first it seemed simple. I'll post the source I have come up with and then explain what I am trying to do.

The file I'm trying to parse contains this data:

HDD Device 0 : /dev/sda
HDD Model ID  : ST3160815A
HDD Serial No : 5RA020QY
HDD Revision  : 3.AAA
HDD Size     : 152628 MB
Interface    : IDE/ATA
Temperature         : 33 C
Health  : 100%
Performance  : 70%
Power on Time : 27 days, 13 hours
Est. Lifetime : more than 1000 days

HDD Device 1 : /dev/sdb
HDD Model ID  : TOSHIBA MK1237GSX
HDD Serial No : 97LVF9MHS
HDD Revision  : DL130M
HDD Size     : 114473 MB
Interface    : S-ATA
Temperature  : 30 C
Health  : 100%
Performance  : 100%
Power on Time : 38 days, 11 hours
Est. Lifetime : more than 1000 days

My source code (below) basically breaks up the file line by line and then splits the line into two (key:value).

Source:

def dataList = [:]
def theInfoName = "C:\\testdata.txt"

File theInfoFile = new File(theInfoName)

def words
def key
def value

if (!theInfoFile.exists()) {
    println "File does not exist"
} else {
    theInfoFile.eachLine { line ->
        if (line.trim().size() == 0) {
            return null
        } else {
            words = line.split("\t: ")
            key = words[0]
            value = words[1]
            dataList[key] = value

            println "${words[0]}=${words[1]}"
        }
    }
    println "$dataList.Performance"  // test if Performance has overwritten the previous Performance value
}

The problem with my source is that when I use my getters (such as $dataList.Performance), I only get the value for the last drive in the file rather than both, because the second drive's keys overwrite the first drive's keys in the map.

So I'm wondering, how do I parse the file so that it keeps the information for both hard drives? Is there a way to pack the info into a 'hard drive object'?

Any and all help is appreciated.

A few side notes:

The file is on a Windows machine (even though the info is grabbed from a *nix system).

Each line of the text file is split by a tab, a colon, and a space (as in my source code). I thought I would state that because it doesn't look like that on this page.

+2  A: 

This will read the data in blocks (with blank lines separating the blocks):

def dataList = []
def theInfoName = 'testdata.txt'

File theInfoFile = new File( theInfoName )

if( !theInfoFile.exists() ) {
  println "File does not exist"
} else {
  def driveInfo = [:]
  // Step through each line in the file
  theInfoFile.eachLine { line ->
    // If the line isn't blank
    if( line.trim() ) {
      // Split into a key and value
      def (key,value) = line.split( '\t: ' ).collect { it.trim() }
      // and store them in the driveInfo Map
      driveInfo."$key" = value
    }
    else {
      // If the line is blank, and we have some info
      if( driveInfo ) {
        // store it in the list
        dataList << driveInfo
        // and clear it
        driveInfo = [:]
      }
    }
  }
  // when we've finished the file, store any remaining data
  if( driveInfo ) {
    dataList << driveInfo
  }
}

dataList.eachWithIndex { it, index ->
  println "Drive $index"
  it.each { k, v ->
    println "\t$k = $v"
  }
}

Fingers crossed you have blank lines between your HDD info sections (you showed one in your test data) :-)

btw: I get the following output:

Drive 0
    HDD Device 0 = /dev/sda
    HDD Model ID = ST3160815A
    HDD Serial No = 5RA020QY
    HDD Revision = 3.AAA
    HDD Size = 152628 MB
    Interface = IDE/ATA
    Temperature = 33 C
    Health = 100%
    Performance = 70%
    Power on Time = 27 days, 13 hours
    Est. Lifetime = more than 1000 days
Drive 1
    HDD Device 1 = /dev/sdb
    HDD Model ID = TOSHIBA MK1237GSX
    HDD Serial No = 97LVF9MHS
    HDD Revision = DL130M
    HDD Size = 114473 MB
    Interface = S-ATA
    Temperature = 30 C
    Health = 100%
    Performance = 100%
    Power on Time = 38 days, 11 hours
    Est. Lifetime = more than 1000 days
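
The question also asked about packing the info into a 'hard drive object'. Here is a minimal sketch of one way that might look, assuming each drive has already been parsed into a map as above. The HardDrive class and its field names are hypothetical (they are not part of the original code), and only a handful of keys are mapped:

```groovy
// Hypothetical value class; the field names are illustrative only.
class HardDrive {
    String device
    String model
    String serial
    String health
    String performance

    // Build a HardDrive from one parsed key/value map.
    static HardDrive fromMap(Map info) {
        // The device key carries the drive number ("HDD Device 0"), so match it by prefix.
        def deviceEntry = info.find { k, v -> k.startsWith('HDD Device') }
        new HardDrive(
            device:      deviceEntry?.value,
            model:       info['HDD Model ID'],
            serial:      info['HDD Serial No'],
            health:      info['Health'],
            performance: info['Performance'])
    }
}

// Sample data in the shape produced by the parsing code (a list of maps).
def dataList = [
    ['HDD Device 0': '/dev/sda', 'HDD Model ID': 'ST3160815A',
     'HDD Serial No': '5RA020QY', 'Health': '100%', 'Performance': '70%'],
    ['HDD Device 1': '/dev/sdb', 'HDD Model ID': 'TOSHIBA MK1237GSX',
     'HDD Serial No': '97LVF9MHS', 'Health': '100%', 'Performance': '100%']
]

def drives = dataList.collect { HardDrive.fromMap(it) }
drives.each { println "${it.device}: ${it.model}, performance ${it.performance}" }
```

Each drive then has its own properties, so drives[0].performance and drives[1].performance no longer collide.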

Messing around, I also got the code down to:

def dataList = []
def theInfoFile = new File( 'testdata.txt' )

if( !theInfoFile.exists() ) {
  println "File does not exist"
} else {
  // Split the text of the file into blocks separated by \n\n
  // Then, starting with an empty list go through each block of text in turn
  dataList = theInfoFile.text.split( '\n\n' ).inject( [] ) { list, block ->
    // Split the current block into lines (based on the newline char)
    // Then starting with an empty map, go through each line in turn
    // when done, add this map to the list we created in the line above
    list << block.split( '\n' ).inject( [:] ) { map, line ->
      // Split the line up into a key and a value (trimming each element)
      def (key,value) = line.split( '\t: ' ).collect { it.trim() }
      // Then, add this key:value mapping to the map we created 2 lines above
      map << [ (key): value ] // The leftShift operator also returns the map 
                              // the inject closure has to return the accumulated
                              // state each time the closure is called
    }
  }
}

dataList.eachWithIndex { it, index ->
  println "Drive $index"
  it.each { k, v ->
    println "\t$k = $v"
  }
}

But that has to load the whole file into memory at once (and relies on \n as the end-of-line character, so a file with \r\n line endings would not split into blocks correctly).
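
Since the file in the question lives on a Windows machine, its lines may well end in \r\n. Below is a sketch of the same whole-file idea that should tolerate both line endings, assuming blocks are still separated by at least one blank line. The sample text is inlined (instead of reading testdata.txt) purely for illustration, and collectEntries is swapped in for the inner inject:

```groovy
// Sample text with Windows line endings, standing in for theInfoFile.text
def text = "HDD Device 0\t: /dev/sda\r\nHealth\t: 100%\r\nPerformance\t: 70%\r\n" +
           "\r\n" +
           "HDD Device 1\t: /dev/sdb\r\nHealth\t: 100%\r\nPerformance\t: 100%\r\n"

// Blocks are separated by one or more blank lines, with either \n or \r\n endings
def dataList = text.trim().split(/\r?\n\s*\r?\n/).collect { block ->
    // Each non-empty line becomes one key:value entry in the block's map
    block.split(/\r?\n/).findAll { it.trim() }.collectEntries { line ->
        def (key, value) = line.split('\t: ').collect { it.trim() }
        [(key): value]
    }
}

dataList.eachWithIndex { drive, index ->
    println "Drive $index: $drive"
}
```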

tim_yates
Ahh, the power of inject. ;)
Blacktiger
Everyone loves inject ;-)
tim_yates
Wow, thanks bud. I don't mean to bother you, but are you able to comment the second one, like you did with the first? Or if that's too much work, perhaps explain how it works. Thanks again, tested it out and works quite nicely. As far as loading it into memory, it should be fine since it's not a large amount of text.
JohnStamos
I've added some comments :-) The main thing is the inject method (called reduce or fold in other languages). It's explained quite well here: http://mrhaki.blogspot.com/2009/09/groovy-goodness-using-inject-method.html
tim_yates
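
For readers who have not met inject before, here is a small standalone sketch of the pattern the comment describes. The sample values are made up for illustration and are not taken from the linked article:

```groovy
// inject(initial) { accumulator, item -> ... } feeds the closure's return value
// back in as the accumulator for the next item.
def total = [1, 2, 3, 4].inject(0) { sum, n -> sum + n }
println total

// The same pattern builds a map from "key<TAB>: value" strings.
def lines = ['Health\t: 100%', 'Performance\t: 70%']
def info = lines.inject([:]) { map, line ->
    def (key, value) = line.split('\t: ').toList()
    map << [(key): value]   // leftShift returns the map, which becomes the next accumulator
}
println info
```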
Thanks for all the help, Tim. You don't happen to have MSN Messenger or something similar, do you? I had a quick question for you.
JohnStamos
no, sorry :-( I'm only on http://twitter.com/tim_yates or email
tim_yates
I don't have a Twitter account :/ Basically, I added a header to my text file, and my attempt to skip over the header has failed. As a side note, I have been thinking about signing up for Twitter, but I haven't looked into it much. I'll check out your Twitter, and if I get an account, I'll be sure to add you.
JohnStamos
Not sure if you're interested, but I posted the question here: http://stackoverflow.com/questions/3456318/groovy-read-text-file-but-omit-header
JohnStamos
+1  A: 

Here is my solution:

File file = new File('testdata.txt')
if(file.exists()) {
    def drives = [[:]]
    // Split each line using whitespace:whitespace as the delimiter.
    file.splitEachLine(/\s:\s/) { items ->
        if(items.size() < 2) {
            // Lines without the delimiter (e.g. blank lines) yield a single item.
            // Start a new map for the next drive, unless the current one is still empty.
            if(drives[-1]) drives << [:]
        } else {
            // Multiple assignment, items[0] => key and items[1] => value
            def (key, value) = items
            drives[-1][key] = value
        }
    }

    drives.eachWithIndex { drive, index ->
        println "Drive $index"
        drive.each {key, value ->
            println "\t$key: $value"
        }
    }
}
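
To see why the size of items signals a block boundary, here is a small sketch of what splitEachLine hands to the closure. It assumes a String stands in for the file (String has a splitEachLine method as well), and the sample text is made up:

```groovy
// Two data lines with an empty line between them
def sample = 'Health\t: 100%\n\nPerformance\t: 70%'
def seen = []

sample.splitEachLine(/\s:\s/) { items ->
    // items is the list of pieces produced by splitting the line on the delimiter
    seen << items.size()
    println "${items.size()} item(s): $items"
}
```

A line containing the delimiter arrives as two items (key and value), while the empty line arrives as a single empty string, which is what the size check relies on.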
Blacktiger
This one works quite nicely too! Thanks bud. Would you mind commenting your code? Now that you two have posted working examples I would like to know how everything is working before I use it :]
JohnStamos