Vim's errorformat (used to parse compile/build errors) relies on an arcane format borrowed from C.

Trying to set up an errorformat for NAnt seems almost impossible. I've tried for many hours and can't get it working. I also see from my searches that a lot of people seem to be having the same problem. A regex to solve this would take minutes to write.

So why does Vim still use this format? It's quite possible that the C parser is faster, but that hardly seems relevant for something that happens once every few minutes at most. Is there a good reason, or is it just a historical artifact?

A: 

lol try looking at the actual vim source code sometime. It's a nest of C code so old and obscure you'll think you're on an archaeological dig.

As for why vim uses the C parser, there are plenty of good reasons starting with that it's pretty universal. But the real reason is that sometime in the past 20 years someone wrote it to use the C parser and it works. No one changes what works.

If it doesn't work for you the vim community will tell you to write your own. Stupid open source bastards.

Whaledawg
You might want to tone it down a bit - if nothing else, it would save us moderators a bit of work ;-p
Marc Gravell
+4  A: 

It's not that Vim uses an arcane format from C. Rather it uses the ideas from scanf, which is a C function. This means that the string that matches the error message is made up of 3 parts:

  • whitespace
  • characters
  • conversion specifications

Whitespace is your tabs and spaces. Characters are the letters, numbers and other normal stuff. Conversion specifications are sequences that start with a '%' (percent) character. In scanf you would typically match an input string against %d or %f to convert to integers or floats. With Vim's error format, you are searching the input string (error message) for files, lines and other compiler specific information.

If you were using scanf to extract an integer from the string "99 bottles of beer", then you would use:

int i;
scanf("%d bottles of beer", &i); // i would be 99, string read from stdin
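For anyone who wants to try this without typing into stdin, here is a minimal sketch using sscanf, which applies the same format to a string in memory. The helper name parse_bottles is made up for illustration. Note that the return value only counts stored conversions, so a mismatch in the literal text after %d goes unnoticed.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: pull the leading count out of a line such as
 * "99 bottles of beer". Returns 1 on success, 0 if the line does not
 * start with an integer. */
int parse_bottles(const char *line, int *count)
{
    assert(line != NULL && count != NULL);
    /* sscanf returns the number of conversions it stored. The literal
     * text after %d is matched but not counted, so only the integer
     * at the start of the line is actually verified here. */
    return sscanf(line, "%d bottles of beer", count) == 1;
}
```

Calling parse_bottles("99 bottles of beer", &n) stores 99 in n, while a line that does not begin with a number makes it return 0.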

With Vim's error format it gets a bit trickier, but it also makes more complex patterns easy to match: multi-line error messages, file names, directory changes and so on. One of the examples in the help for errorformat is useful:

Error 275
line 42
column 3
' ' expected after '--'

The appropriate error format string has to look like this:

  :set efm=%EError\ %n,%Cline\ %l,%Ccolumn\ %c,%Z%m

Here %E tells Vim that it is the start of a multi-line error message. %n is an error number. %C is the continuation of a multi-line message, with %l being the line number, and %c the column number. %Z marks the end of the multiline message and %m matches the error message that would be shown in the status line. You need to escape spaces with backslashes, which adds a bit of extra weirdness.
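To make the scanf analogy concrete, here is a rough sketch of the same four-line message parsed with sscanf, one call per efm item. This is only an analogy and not how Vim implements it. The struct error_entry and the function parse_multiline_error are hypothetical names, and the sketch assumes the four lines have already been split and stripped of newlines.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record; the fields mirror %n, %l, %c and %m. */
struct error_entry {
    int number;
    int line;
    int column;
    char message[128];
};

/* Parse the four lines of the example message.
 * Returns 1 only if every line matches its pattern, otherwise 0. */
int parse_multiline_error(const char *lines[4], struct error_entry *out)
{
    assert(lines != NULL && out != NULL);

    /* %EError\ %n : start of the message, carries the error number */
    if (sscanf(lines[0], "Error %d", &out->number) != 1)
        return 0;
    /* %Cline\ %l : continuation line with the line number */
    if (sscanf(lines[1], "line %d", &out->line) != 1)
        return 0;
    /* %Ccolumn\ %c : continuation line with the column number */
    if (sscanf(lines[2], "column %d", &out->column) != 1)
        return 0;
    /* %Z%m : the last line is taken whole as the message text */
    if (strlen(lines[3]) == 0)
        return 0;
    snprintf(out->message, sizeof out->message, "%s", lines[3]);
    return 1;
}
```

One difference worth keeping in mind: a space in a scanf format matches any amount of whitespace (including none), so this sketch is looser than the escaped spaces in the efm string.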

While it might initially seem easier with a regex, this mini-language is specifically designed to help with matching compiler errors. It has a lot of shortcuts in there. I mean you don't have to think about things like matching multiple lines, multiple digits, matching path names (just use %f).

Another thought: How would you map numbers to mean line numbers, or strings to mean files or error messages if you were to use just a normal regexp? By group position? That might work, but it wouldn't be very flexible. Another way would be named capture groups, but then this syntax looks a lot like a shorthand for that anyway. You can actually use regexp wildcards such as .* too; in this language it is written %.%#.
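For comparison, this is roughly what the "by group position" approach could look like with POSIX regex.h, which has no named groups. It is a sketch under assumptions: the error shape "<file>:<line>: <message>" is invented for the example, and struct location and parse_by_group_position are hypothetical names. The meaning of each group lives only in the code that reads m[1], m[2] and m[3], which is the inflexibility mentioned above.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result for a line shaped like "<file>:<line>: <message>". */
struct location {
    char file[256];
    int line;
    char message[256];
};

/* Copy one capture group out of text into dst, truncating to fit. */
static void copy_group(const char *text, const regmatch_t *g,
                       char *dst, size_t n)
{
    size_t len = (size_t)(g->rm_eo - g->rm_so);
    if (len >= n)
        len = n - 1;
    memcpy(dst, text + g->rm_so, len);
    dst[len] = '\0';
}

/* Returns 1 and fills out if text matches, otherwise 0. */
int parse_by_group_position(const char *text, struct location *out)
{
    regex_t re;
    regmatch_t m[4]; /* m[0] is the whole match, m[1..3] the groups */
    char digits[16];
    int ok;

    assert(text != NULL && out != NULL);
    if (regcomp(&re, "^([^:]+):([0-9]+): (.*)$", REG_EXTENDED) != 0)
        return 0;
    ok = regexec(&re, text, 4, m, 0) == 0;
    if (ok) {
        /* Group 1 is the file, group 2 the line, group 3 the message:
         * the mapping is purely positional. */
        copy_group(text, &m[1], out->file, sizeof out->file);
        copy_group(text, &m[2], digits, sizeof digits);
        out->line = atoi(digits);
        copy_group(text, &m[3], out->message, sizeof out->message);
    }
    regfree(&re);
    return ok;
}
```

If the tool ever puts the line number before the file name, both the pattern and the group indices have to change together, whereas in an efm string the %f and %l items simply swap places.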

OK, so it is not perfect. But it's not impossible either and makes sense in its own way. Get stuck in, read the help and stop complaining! :-)

rq
"Another thought: How would you map numbers to mean line numbers, or strings to mean files or error messages if you were to use just a normal regexp? By group position?" With named capture groups I guess. Good explanation anyway. Do you have to match the entire contents of the line? Or will it try to match it anywhere like a regex?
flukus
I've commented on named capture groups there. The match requires you to be completely precise. Anything extra that appears in the message that you have not contemplated would cause the match to fail. To get around that you could use the wildcard idiom `%.%#` as you would use `.*` in a regex.
rq