My GPS logger occasionally leaves "unfinished" lines at the end of the log files. I think they're only at the end, but I want to check all lines just in case.

A sample complete sentence looks like:

$GPRMC,005727.000,A,3751.9418,S,14502.2569,E,0.00,339.17,210808,,,A*76

The line should start with a $ sign and end with an * followed by a two-character hex checksum. I don't care if the checksum is correct, just that it's present. The check also needs to ignore "ADVER" sentences, which don't have the checksum and appear at the start of every file.

The following Python code might work:

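import re
from path import path  # third-party path.py module
# Raw string so the backslashes reach the regex engine unchanged.
nmea = re.compile(r"^\$.+\*[0-9A-F]{2}$")
for log in path("gpslogs").files("*.log"):
   for line in log.lines():
      # Report lines that are neither complete sentences nor ADVER headers.
      if not nmea.match(line) and "ADVER" not in line:
         print("%s\n\t%s\n" % (log, line))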

Is there a way to do that with grep or awk or something simple? I haven't really figured out how to get grep to do what I want.

Update: Thanks @Motti and @Paul, I was able to get the following to do almost what I wanted, but had to use single quotes and remove the trailing $ before it would work:

grep -nvE '^\$.*\*[0-9A-F]{2}' *.log | grep -v ADVER | grep -v ADPMB

Two further questions arise: how can I make it ignore blank lines, and can I combine the last two greps?

+3  A: 

Minimal testing suggests that this should do it:

grep -Ev "^\$.*\*[0-9A-Fa-f]{2}$" a.txt | grep -v ADVER
  • -E use extended regexp
  • -v Show lines that do not match
  • ^ start of line
  • \$ a literal dollar sign
  • .* anything
  • \* an asterisk
  • [0-9A-Fa-f] hexadecimal digit
  • {2} exactly two of the previous
  • $ end of line
  • | grep -v ADVER weed out the ADVER lines
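
For anyone who wants to poke at the pattern outside the shell, here is a minimal Python sketch of the same expression. The helper name is_complete and the two truncated sample sentences are made up for illustration; only the first sample comes from the question.

```python
import re

# Same pattern as the grep command, written as a raw string so the
# backslashes reach the regex engine untouched.
NMEA = re.compile(r"^\$.*\*[0-9A-Fa-f]{2}$")

def is_complete(sentence):
    """Return True if the sentence starts with $ and ends with * plus two hex digits."""
    return NMEA.match(sentence) is not None

samples = [
    # Complete sentence from the question.
    "$GPRMC,005727.000,A,3751.9418,S,14502.2569,E,0.00,339.17,210808,,,A*76",
    # Hypothetical sentence cut off before the checksum.
    "$GPRMC,005727.000,A,3751.9418,S,145",
    # Hypothetical sentence with only one checksum digit.
    "$GPRMC,005727.000,A,3751.9418,S,14502.2569,E,0.00,339.17,210808,,,A*7",
]

for s in samples:
    print(is_complete(s), s)
```

Only the first sample should be reported as complete; the other two are the kind of line that grep -v would print.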

HTH, Motti.

Motti
+1  A: 

@Motti's answer doesn't ignore ADVER lines, but you can easily pipe the results of that grep to another:

grep -Ev "^\$.*\*[0-9A-Fa-f]{2}$" a.txt | grep -v ADVER
Paul Tomblin
+1  A: 

@Tom (rephrased) I had to remove the trailing $ for it to work

Removing the $ means that the line may end with something else after the checksum. For example, the following would be accepted:

$GPRMC,005727.000,A,3751.9418,S,14502.2569,E,0.00,339.17,210808,,,A*76xxx
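
A small Python sketch of the difference, assuming the same pattern with and without the trailing anchor (the variable names are only for illustration):

```python
import re

# Pattern with the trailing $ anchor, and the same pattern without it.
anchored = re.compile(r"^\$.*\*[0-9A-F]{2}$")
unanchored = re.compile(r"^\$.*\*[0-9A-F]{2}")

good = "$GPRMC,005727.000,A,3751.9418,S,14502.2569,E,0.00,339.17,210808,,,A*76"
trailing = good + "xxx"  # same sentence with junk after the checksum

for name, pattern in (("anchored", anchored), ("unanchored", unanchored)):
    print(name, bool(pattern.match(good)), bool(pattern.match(trailing)))
```

Both patterns accept the clean sentence, but only the unanchored one accepts the sentence with junk after the checksum.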

@Tom And can I combine the last two greps?

grep -Ev "ADVER|ADPMB"
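
If the shell pipeline keeps misbehaving, the same filtering could be sketched in Python instead, which also covers the blank-line question. This is only a sketch: the function name unfinished_lines and the sample lines are made up, and stripping the line ending before matching is my own choice, not something taken from the commands above.

```python
import re

NMEA = re.compile(r"^\$.*\*[0-9A-F]{2}$")
SKIP = re.compile(r"ADVER|ADPMB")  # same alternation as the combined grep

def unfinished_lines(lines):
    """Yield (line_number, text) for lines that look like unfinished sentences."""
    for number, line in enumerate(lines, 1):
        text = line.rstrip("\r\n")  # drop LF or CR+LF line endings
        if not text.strip():
            continue  # ignore blank lines
        if SKIP.search(text):
            continue  # ignore ADVER and ADPMB sentences
        if not NMEA.match(text):
            yield number, text

# Hypothetical log content for illustration.
sample = [
    "$ADVER,1.0\r\n",
    "\r\n",
    "$GPRMC,005727.000,A,3751.9418,S,14502.2569,E,0.00,339.17,210808,,,A*76\r\n",
    "$GPRMC,0057",
    "$ADPMB,1\r\n",
]

for number, text in unfinished_lines(sample):
    print(number, text)
```

Reading a real file would just mean passing an open file object instead of the sample list.
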
Motti
A: 

@Motti: Combining the greps isn't working; it has no effect.

I understand that without the trailing $ something else may follow the checksum and still match, but it didn't work at all with it, so I had no choice...

GNU grep 2.5.3 and GNU bash 3.2.39(1) if that makes any difference.

And it looks like the log files are using DOS line-breaks (CR+LF). Does grep need a switch to handle that properly?

Tom
A: 

@Tom

GNU grep 2.5.3 and GNU bash 3.2.39(1) if that makes any difference. And it looks like the log files are using DOS line-breaks (CR+LF). Does grep need a switch to handle that properly?

I'm using grep (GNU grep) 2.4.2 on Windows (for shame!) and it works for me (DOS line breaks are naturally accepted). I don't have access to other OSs at the moment, so I'm sorry, but I won't be able to help you any further :o(
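
One possible explanation for the difference, sketched in Python rather than grep: when a CR+LF file is read on a system that only treats LF as the line ending, each line still ends with a carriage return, so the two hex digits are no longer the last characters before the end of the line. The cr_tolerant pattern below is an assumption about how to allow for that, not something tested against the actual logs.

```python
import re

strict = re.compile(r"^\$.*\*[0-9A-F]{2}$")
# Assumed variant: allow one optional carriage return before the end of the line.
cr_tolerant = re.compile(r"^\$.*\*[0-9A-F]{2}\r?$")

sentence = "$GPRMC,005727.000,A,3751.9418,S,14502.2569,E,0.00,339.17,210808,,,A*76"
with_cr = sentence + "\r"  # what remains of a CR+LF line once the LF is removed

print(bool(strict.match(sentence)), bool(strict.match(with_cr)))
print(bool(cr_tolerant.match(sentence)), bool(cr_tolerant.match(with_cr)))
```

The strict pattern rejects the line with the trailing carriage return, while the tolerant one accepts both forms and still rejects junk after the checksum. With grep, a similar effect might be had by stripping carriage returns first, for example by piping the file through tr -d '\r'.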

Motti