I am trying to search for the substring "abc" in a specific file in Linux/bash.

So I do:

grep '*abc*' myFile

It returns nothing.

But if I do:

grep 'abc' myFile

It returns matches correctly.

Now, this is not a problem for me. But what if I want to grep for a more complex string, say:

*abc * def *

How would I accomplish it using grep?

+1  A: 

Try grep -E for extended regular expression support

Also take a look at:

The grep man page
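
A minimal sketch of what -E changes, assuming a throwaway sample file (the file contents and variable names here are made up for illustration). Extended mode adds operators such as + without backslashes, but "anything" is still written .* rather than a bare *:

```shell
# Hypothetical sample data: three lines with different spacing.
sample=$(mktemp)
printf '%s\n' 'abc def' 'abc    def' 'abcdef' > "$sample"

# In extended mode, + means "one or more of the previous item",
# so this requires at least one space between abc and def.
ere_space_hits=$(grep -cE 'abc +def' "$sample")

# .* still means "zero or more of any character", so all three lines match.
ere_any_hits=$(grep -cE 'abc.*def' "$sample")

echo "abc +def: $ere_space_hits line(s), abc.*def: $ere_any_hits line(s)"
rm -f "$sample"
```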

Brian
+3  A: 

The "star sign" is only meaningful if there is something in front of it. If there isn't, the tool (grep in this case) may just treat it as an error. For example:

'*xyz'    is meaningless
'a*xyz'   means zero or more occurrences of 'a' followed by xyz
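
What a particular grep does with a leading star is easy to check. The sketch below assumes grep's default mode (POSIX basic regular expressions), where a * at the very start of the pattern is taken as an ordinary character; the sample lines are invented for the test:

```shell
# Hypothetical sample data: one line with a literal star, one without.
sample=$(mktemp)
printf '%s\n' 'foo *xyz bar' 'foo xyz bar' > "$sample"

# Leading * has nothing to repeat, so in basic mode it matches a literal star:
# only the first line is selected.
star_hits=$(grep -c '*xyz' "$sample")

# With an item in front of it, * repeats that item: zero or more "a"
# followed by xyz, which both lines contain.
repeat_hits=$(grep -c 'a*xyz' "$sample")

echo "'*xyz': $star_hits line(s), 'a*xyz': $repeat_hits line(s)"
rm -f "$sample"
```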
anon
The * is not meaningless; it just doesn't have its usual meaning (of repetition) but means "I'm a star". It would match a line containing a star followed by x, y, and z.
Jonathan Leffler
@Jonathan It depends on the tool.
anon
A: 

'*' works as a modifier for the previous item. So 'abc*def' searches for 'ab' followed by 0 or more 'c's followed by 'def'.

What you probably want is 'abc.*def' which searches for 'abc' followed by any number of characters, followed by 'def'.
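
The difference shows up on a small test file. This is only a sketch; the three sample lines are invented to exercise both patterns:

```shell
# Hypothetical sample data.
sample=$(mktemp)
printf '%s\n' 'abdef' 'abcdef' 'abc 123 def' > "$sample"

# 'abc*def': "ab", then zero or more "c", then "def" immediately after.
star_only=$(grep 'abc*def' "$sample")

# 'abc.*def': "abc", then any run of characters (possibly empty), then "def".
dot_star=$(grep 'abc.*def' "$sample")

printf 'abc*def matched:\n%s\n\n' "$star_only"
printf 'abc.*def matched:\n%s\n' "$dot_star"
rm -f "$sample"
```

The first pattern picks up abdef and abcdef; the second picks up abcdef and abc 123 def.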

Conspicuous Compiler
+5  A: 

The asterisk is just a repetition operator, but you need to tell it what to repeat. In /*abc*/ the second * applies to the c, so that part matches ab followed by zero or more c's. The first * has nothing to repeat, so depending on the tool it is either rejected or taken as a literal star (grep's default basic regular expressions do the latter, which is why your first command found nothing). If you want to match anything, you need to say .* -- the dot means any character (within certain guidelines). If you just want to match abc, you can simply say grep 'abc' myFile. For your more complex match, you need to use .* -- grep 'abc.*def' myFile will match a line that contains abc followed by def, with anything (or nothing) in between.

Update based on a comment:

* in a regular expression is not exactly the same as * in the console. In the console, * is part of a glob construct, and just acts as a wildcard (for instance ls *.log will list all files that end in .log). However, in regular expressions, * is a modifier, meaning that it only applies to the character or group preceding it. If you want * in regular expressions to act as a wildcard, you need to use .* as previously mentioned -- the dot is a wildcard character, and the star, when modifying the dot, means find zero or more dots; i.e. find zero or more of any character.
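
To put the two notations side by side, here is a small sketch that tests the same (made-up) file name once with a shell glob and once with a regular expression. The case statement uses the shell's own glob matching; grep uses a regex:

```shell
# Hypothetical file name used only for the comparison.
name='server.log'

# Shell glob: * on its own stands for any run of characters.
case $name in
  *.log) glob_result='match' ;;
  *)     glob_result='no match' ;;
esac

# Regular expression: ".*" is "any run of characters", "\." is a literal dot,
# and "$" anchors the match at the end of the line.
if printf '%s\n' "$name" | grep -q '.*\.log$'; then
  regex_result='match'
else
  regex_result='no match'
fi

echo "glob *.log: $glob_result; regex .*\\.log\$: $regex_result"
```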

Daniel Vandersluis
I think the questioner is confused about the difference between shell wildcards and regular expressions. I also suspect that the more complicated expression would be: grep 'abc .* def' (at least one space present - possibly two as I wrote).
Jonathan Leffler
Actually, the questioner seems not to understand that 'abc' is not the same thing as '^abc$' :-D
Massa
Yes, I was confused between globs and full regular expressions. I use the * without a dot to mean matching anything on the shell.
Saobi
A: 

The dot character means match any character, so '.*' means zero or more occurrences of any character. You probably mean to use '.*' rather than just '*'.

Ah, crap,

smcameron
A: 

Use grep -P (a GNU grep option), which enables support for Perl-style regular expressions.

grep -P "abc.*def" myFile
Artem Russakovskii
+1  A: 

The expression you tried, like those that work on the shell command line in Linux, for instance, is called a "glob". Glob expressions are not full regular expressions, which is what grep uses to specify strings to look for. Here is an (old, small) post about the differences. The glob expressions (as in "ls *") are interpreted by the shell itself.

It's possible to translate from globs to REs, but you typically need to do so in your head.
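
For the simple cases the translation can be mechanised. The function below is a rough sketch, not a standard tool: glob_to_bre is a name invented here, and it only handles * and ?, treating every other character (including [ and ]) literally:

```shell
# glob_to_bre: hypothetical helper that turns a simple glob into a basic
# regular expression. Only * and ? are translated.
glob_to_bre() {
  body=$(printf '%s\n' "$1" |
    sed -e 's/[[\.^$]/\\&/g' \
        -e 's/\*/.*/g' \
        -e 's/?/./g')
  # A glob has to match the whole string, so anchor both ends.
  printf '^%s$\n' "$body"
}

# The pattern from the question, translated and then handed to grep.
pattern=$(glob_to_bre '*abc * def *')
echo "$pattern"
printf '%s\n' 'xx abc 1 def yy' | grep "$pattern"
```

The first sed expression backslash-escapes the characters that are special in a basic regex ([, \, ., ^ and $) before the glob wildcards are rewritten, so a literal dot in the glob stays a literal dot.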

unwind
It's only a glob if it's parsed by the shell. Since he is preserving the search string inside single quotes, the shell leaves the string alone and passes it intact in argv to grep.
Conspicuous Compiler
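
The quoting point in the comment above can be seen with a scratch directory. This is a sketch with an invented file name (xabcx); printf stands in for grep so that the argument each form would receive is visible:

```shell
# Hypothetical scratch directory containing one file whose name fits *abc*.
workdir=$(mktemp -d)
touch "$workdir/xabcx"

# Unquoted: the shell expands the glob before the command runs.
unquoted=$(cd "$workdir" && printf '%s\n' *abc*)

# Single-quoted: the shell passes the four characters through untouched.
quoted=$(cd "$workdir" && printf '%s\n' '*abc*')

echo "unquoted argument: $unquoted"
echo "quoted argument:   $quoted"
rm -rf "$workdir"
```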