Are Regular Expressions a must for doing programming?

+23  A: 

I would say no, they are not a must. You can be a perfectly good programmer without knowing them.

I find I use Regular Expressions mostly for one-off tasks of data manipulation rather than for actually putting in application code. They can be handy for validating input data but these days your controls often do that for you anyway.

Dave Webb
+1 for data manipulation!
amartin
+1 Agreed, the most useful things I have done with regex are rather complex search/replace operations in Visual Studio where they saved me from loads of manual labour.
Fredrik Mörk
They can be extremely powerful in XML validation against a schema however. For text manipulation, searching, and validation, there is sometimes no other way.
Trampas Kirk
+5  A: 

No. You can be programming for years without touching regular expressions. Of course it will mean that in some cases where someone who knows REs would use them, you would do something else. There is always more than one way to solve a particular problem, and regular expressions are just one way (a very efficient, and perhaps therefore popular, way) of expressing patterns.

unwind
+2  A: 

In a word, No.

But they can certainly be the right tool for the right job and are worth learning for those string matching operations where they work best. However, just because you've got a good, big hammer, it doesn't mean you should use it to crack every nut.

Sam Meldrum
+41  A: 

Yes. You can manage without them, but you really should learn at least the basics, as most computing tasks could use them. You will save a lot of pain and hassle in the long run. Regexes are much easier than you think once you get over the initial 'wtf' stage.

Chris Huang-Leaver
+1 for "once you get over the initial 'wtf' stage": when I got past that stage, RegEx became almost second nature, and now only those stupidly long (email) RegExes are mind-numbing.
Unkwntech
Yeah I used to think that Regex looked like someone just slammed their hand on the keyboard. But once you learn what each character matches it becomes 2nd nature.
John Isaacks
A: 

Depending on your field there are certain problems that lend themselves to regexes - or rather the other way around: the solution /not/ using regular expressions is extremely clumsy. email verification/url verification/minimum password strength/date parsing come to mind.

BuschnicK
E-mail and URL verification are things you mustn't do using a regexp, as it's nearly impossible to cover all cases.
Georg
+1  A: 

Well, in theoretical computer science they're a very strong and useful piece of "equipment": with them you can define regular languages and relate them to NFAs or even DFAs, and so prove some difficult theorems in computation theory, finite automata, and formal languages. In practical programming they're very useful as well, since they let you perform complex string manipulation in a relatively easy way.

Artem Barger
A: 

A must it is not. Though there is some perception that a good programmer should know them, I wouldn't say so. When the time comes and you need them, you'll just use them. Anyway, go six months without using them and you won't remember any of the expression options.

Like everything factual in programming, you learn it, you forget it, you relearn it again.

User
+2  A: 

At least knowing that regular expressions exist and what they can be used for is an absolute must. Otherwise you will be in danger of reinventing the wheel in many situations. If you know about their existence you can go into the details once you have to apply them. BTW, the theory behind regular expressions is quite interesting :-)

jens
+14  A: 

Not at all. Anything that you can do with regular expressions is entirely possible to do without them.

However, it's a powerful pattern matching system, so some things that are quite easy to accomplish with a simple regular expression pattern take a lot of code to do without it.

For example, this:

// Capture each consonant in group 1 so that $1 in the replacement refers to it.
s = Regex.Replace(s, "([bcdfghjklmnpqrstvwxz])", "$1o$1");

needs a bit more code to do without a regular expression:

StringBuilder b = new StringBuilder();
foreach (char c in s) {
   if ("bcdfghjklmnpqrstvwxz".IndexOf(c) != -1) {
      b.Append(c).Append('o').Append(c);
   } else {
      b.Append(c);
   }
}
s = b.ToString();

Or if you are not quite as experienced a programmer, you could easily create something that is even more code and performs horribly:

string temp = "";
for (int i = 0; i < s.Length; i++ ) {
   if (
      s[i] == 'b' || s[i] == 'c' || s[i] == 'd' ||
      s[i] == 'f' || s[i] == 'g' || s[i] == 'h' ||
      s[i] == 'j' || s[i] == 'k' || s[i] == 'l' ||
      s[i] == 'm' || s[i] == 'n' || s[i] == 'p' ||
      s[i] == 'q' || s[i] == 'r' || s[i] == 's' ||
      s[i] == 't' || s[i] == 'v' || s[i] == 'w' ||
      s[i] == 'x' || s[i] == 'z'
   ) {
      temp += s.Substring(i, 1);
      temp += "o";
      temp += s.Substring(i, 1);
   } else {
      temp += s.Substring(i, 1);
   }
}
s = temp;
Guffa
I find that regular expressions can be replaced by other code, but that code usually becomes very long and hard to maintain.
動靜能量
Obviously, it's tautological that a regex can always be replaced by code, but the absurdly trivial example above doesn't go very far. Any reasonably complex regex is only going to be replaced by coding up something like the underlying NFA, something that's not going to be easy for a developer who doesn't even know regexes. Many times, I've seen developers without basic parsing theory try to code up simple parsers and lexers. It's never pretty, and it's usually amazingly time-consuming.
tnyfst
The only restriction you can put on an element in an XSD other than requiredness and min/max occurrences is a regex. So it's not entirely true that you can substitute some code in its place. Sure you could do the validation outside the XSD, but you *cannot* do validation in the XSD with anything *but* a regex.
Trampas Kirk
+1  A: 

Probably not. But they are really easy to learn. At least the basics (the stuff all the regex engines do) are quickly taught. I learnt it in a chat window from another guy in like 30 minutes...

Daren Thomas
A: 

No.

Depending on what you're trying to achieve, Regex can be useful. But I would hazard that 80% or more of programmers never use Regex, some 15% or so use it only occasionally (and have to Google it), and only a small percentage of the remainder are actually Regex Ninjas.

I have found Regexr is pretty good for the rare occasions I use Regex.

Also, someone will mention a certain quote from jwz within the next minute or so...

Colin Pickard
+2  A: 

Actually, my feeling is that it is a must...

For example, I was looking at why a portion of our YouTube videos didn't work... and it turned out the links for those videos are

http://ca.youtube.com/v/raINk2Ii1A4 (not actual URL, just as an example)

instead of

http://www.youtube.com/v/raINk2Ii1A4

Another programmer earlier used "substr()" to extract the youtube video ID, and because of the ca.youtube.com portion, the ID was extracted wrong.

So to my feeling, regular expressions are very important and without that, hidden bugs can be introduced more often than usual.
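As a rough sketch of the difference (in Python, not the language of the original `substr()` call; the pattern, the function names, and the assumed ID character set are all made up for illustration), a pattern that tolerates any subdomain keeps working where a fixed offset does not:

```python
import re

# Hypothetical pattern: any youtube.com subdomain, then /v/<id>.
# The ID character set (letters, digits, "_" and "-") is an assumption.
VIDEO_ID = re.compile(r"^https?://(?:[\w-]+\.)*youtube\.com/v/([\w-]+)")

def extract_video_id(url):
    """Return the video ID from a /v/ style URL, or None if it does not match."""
    match = VIDEO_ID.match(url)
    return match.group(1) if match else None

def extract_by_offset(url):
    """The fragile approach: assume the prefix is always the www one."""
    return url[len("http://www.youtube.com/v/"):]

print(extract_video_id("http://www.youtube.com/v/raINk2Ii1A4"))  # raINk2Ii1A4
print(extract_video_id("http://ca.youtube.com/v/raINk2Ii1A4"))   # raINk2Ii1A4
print(extract_by_offset("http://ca.youtube.com/v/raINk2Ii1A4"))  # aINk2Ii1A4 (first character lost)
```

The fixed-offset version silently returns a wrong ID, while the pattern returns `None` when the URL does not have the expected shape, so the failure is at least visible.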

But I actually met 3 developers before, one was a very good web applications developer, one had a Master of Science degree from a prestigious Silicon Valley top university, and one was a high-profile master grad, and it turned out they all didn't know regular expressions. That was a bit surprising to me.

動靜能量
Regular expressions being so handy in your particular situation doesn't mean that regular expressions are always a must. Surely they make life easier, especially in this case, but one can still do without them.
Cloud
Jeff Atwood wrote up a nice article about Regular Expressions - explaining some of the best and worst in them: http://www.codinghorror.com/blog/archives/001016.html
scraimer
Then again could you just split the string on "/" and pull the last element... like everything else there are many ways to do the same task. It's the foresight that's needed for things such as your substring.
Matthew Whited
+1  A: 

No. I'm terrible at regular expressions myself, and still I'm a bad programmer. Wait. What?

On a more serious note: I don't know regular expressions, but hardly ever need them. If I really need one, for instance when I need to validate user input like Dave mentions, I ask a colleague.

There are so many things that are valuable to know / learn as a programmer, but I'd dare say regular expressions is far from being anywhere near the top of that list.

Razzie
I disagree. About the only skill other than a regex that is so globally applicable is typing. Good text editors support regex, most IDEs do, XML validation uses them, they're a great boon to parsing log files. They're useful in linux, windows, a great many programming languages. Not knowing them at all is a pretty severe handicap IMO.
Trampas Kirk
Well, of course regexes come in handy sometimes. Does knowing them make you a better programmer? Yes. Is it a must for every programmer? No. Like I said, there are many things valuable for a programmer, but imho regex is just not at the top of that list. SQL (or databases in general), UML (or design skills in general), unit testing (yes, it is a skill as well) - all far more important imho.
Razzie
I really don't agree with this simply because you stated "I ask a colleague". What if all your colleagues had the same attitude, then who would you turn to?
Kibbee
If I can't get a particular layout right in IE6, I also ask a colleague who is an expert in cross-browser HTML. That's why you have certain areas of expertise. Mine isn't HTML, and I don't want to become an expert either (even though my branch of work is web development - I focus more on the server side). The same goes for regular expressions. If you ever need some information about a certain topic, that doesn't mean you have to learn everything there is to know about it, right? If it did, my head would explode with information right now (well, sometimes I feel it does, anyway).
Razzie
A: 

NO.

A lot will depend on the type of project you have. For example, validating information might lend itself to regular expressions, but you shouldn't try to force them into every project you're on.

kevchadders
+66  A: 
Unkwntech
Yeah, the comic was obligatory.
Unkwntech
Great answer and funny comic, thanks!
Moayad Mardini
It's probably a good idea to attribute the image in the post to xkcd.com
StompChicken
Tempted to downvote for not adding the alt text.
Michael Myers
Since the answer seems popular I've added some more info and resources.
Unkwntech
"Wait, forgot to escape a space. Wheeeeee[taptaptap]eeeeee."
Alex Barrett
I tried to add the alt text, but StackOverflow and images and me never seem to mix correctly. I must be missing something.
Chris Lutz
Works for me.
Michael Myers
It works for me too now. Images just don't like me. :'(
Chris Lutz
TBF, in Perl you'd just use an existing module and use that :p. The valid email address regex is literally several pages long. XKCD-- for not getting it right.
Kent Fredric
yeah the email regex is just scary, for anyone who hasn't seen it: http://tinyurl.com/3qy5q
Unkwntech
BTW that regex will match ANY technically valid email address, valid in this case means that it conforms to the RFCs
Unkwntech
I don't think the comic is talking about an email address. (Not that street addresses are any easier.)
Michael Myers
@mmyers that's correct, it's not, and no, they are not.
Unkwntech
RegExBuddy is pretty pointless, I had an easier time typing RegExp by hand rather than use RegExBuddy. To each their own.
Dmitri Farkov
+1 for setting img title=""!
Pete
+1  A: 

I guess it is not a must but they will ease your life and save you so much time.

If you don't know how to use regular expressions, you don't know what you are missing. But just watching a person use them to complete a task makes you feel that it is a skill you should definitely have.

anna
A: 

Regular Expressions is a powerful pattern matching language. And it is not limited to text strings. But as always, your code, your call.

Nick D
A: 

Simply, no. It all depends on what your program is set out to achieve.

Of course knowing what a RegExp is and a basic understanding of how they work can be useful in the future.

James Brooks
+4  A: 

If you care about developing a career as a software engineer, then yes. I hire software engineers and if they don't know the basics of using regular expressions, or have never heard of them, then I wonder how much experience they actually have across the entire spectrum of programming techniques. What else don't they know?

Most of the comments above say 'no, you can solve the problem in other ways' and they also mostly say the alternatives are more code and take longer to write... now think maintainability and how easy this bespoke code would be to change... Use a regular expression - then it's just a single line of code.

edr
+9  A: 

Let me put it this way, if you have regular expressions in your toolkit, you'll save yourself a lot of time and energy. If you don't have them, you won't know what you're missing out on so you'll still be happy.

As a web developer, I use them very often (input validation, extracting data from a site etc).

EDIT: I realized it might help you to look at some common problems that regex is used for by looking at the regex tag right here on stackoverflow.

aleemb
A: 

I agree with the others that it's probably not a must, but it's very helpful to have at least a basic understanding. I have a RegEx cheat sheet posted in my cube that I find very helpful. http://regexlib.com/CheatSheet.aspx

Tim
A: 

Understanding regular expressions is not a must. However, it is an effective tool for processing text. If you work on projects that manipulate text, you will eventually run across them.

Regular expressions come with a variety of challenges, whether you are using them or just supporting code that has them. Be aware that there are a variety of syntax flavors. Different libraries and languages often have slightly different syntax rules. Regular expressions, as they become more complicated, can easily transition from a simple pattern matching tool to a piece of magic, write-only code that cannot be easily understood. And, like most text processing tools, they can often be difficult to troubleshoot or change (e.g. you have a corner case that no longer fits the features of the tool).

As with all parsing code, I recommend a lot of unit tests. In particular, watch out for edge conditions, repeated text patterns and unusual inputs.
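A small table of cases is one way to do this. The sketch below is in Python and uses a hypothetical date pattern purely for illustration (it checks the shape of the text, not whether the day exists in that month); the names `DATE`, `is_date`, `CASES`, and `run_cases` are made up:

```python
import re

# Hypothetical pattern for illustration: an ISO-like date such as 2009-06-15.
DATE = re.compile(r"\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])")

def is_date(text):
    """True only if the whole string has the expected date shape."""
    return DATE.fullmatch(text) is not None

# (input, expected result) pairs covering the categories mentioned above.
CASES = [
    ("2009-06-15", True),              # typical input
    ("2009-12-31", True),              # edge: last valid month and day
    ("2009-13-01", False),             # edge: month out of range
    ("2009-00-10", False),             # edge: month zero
    ("2009-06-15-2009-06-15", False),  # repeated text pattern
    ("", False),                       # unusual: empty string
    ("2009-06-15\n", False),           # unusual: trailing newline
    (" 2009-06-15", False),            # unusual: leading space
]

def run_cases():
    """Return the inputs whose result differs from the expectation."""
    return [text for text, expected in CASES if is_date(text) != expected]

print(run_cases())  # an empty list means every case passed
```

Adding a new row whenever a corner case turns up in real data keeps the pattern from quietly regressing when it is changed later.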

Jim Rush
A: 

Definitely not, I (like many people) have been programming for years without touching them. That said, once you get to know them you start to see where they might have been useful in the past :-)

I'd say - just read up on the basics so you know what RegExes are and what you can do with them, then if you ever find they might be useful you can grab a tutorial / reference website like http://www.regular-expressions.info/ and jump right in.

Led
A: 

If you're developing a new product, I would suggest you avoid them, or at the very most use them sparingly and judiciously.

If you're maintaining a product that already uses regexp's you are left with no choice.

It helps to at least be able to recognize a regular expression, so if you encounter a particularly obfuscated piece of code you know the right search term to find a reference card.

Alterlife
A: 

No more so than, say, knowing HTML or being able to use a relational database. Strictly speaking, no, they're not a requirement for doing programming--- they might be essential and fundamental in some jobs, and yet irrelevant in others. You're unlikely to use regular expressions (or HTML or SQL, for that matter) while writing a device driver for a new Ethernet chip. In my area I use regular expressions occasionally in production code, much more often in ad-hoc scripts to massage reports etc. I've worked on one project where they were a central feature (an application to analyse free-form text to look for certain key phrases to produce a compiled rule set).

araqnid
A: 

Regular expressions are important at least to learn if not to use.

First, you must be able to read and understand others' regular expression code.

Second, basic regular expressions correspond to finite automata (by the Kleene theorem), which makes them fundamentally important for algorithm design.
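To make the correspondence concrete, here is a minimal sketch in Python. The expression `ab*c` and the transition table are chosen only for illustration, and Python's `re` module is used just as a reference to compare against, since its engine is a backtracking one and not a DFA:

```python
import re

# Hand-written DFA for the regular expression ab*c.
# States: 0 = start, 1 = seen "a" followed by any number of "b", 2 = accepting.
TRANSITIONS = {
    (0, "a"): 1,
    (1, "b"): 1,
    (1, "c"): 2,
}
ACCEPTING = {2}

def dfa_accepts(text):
    """Run the automaton over text; accept only if it ends in an accepting state."""
    state = 0
    for ch in text:
        state = TRANSITIONS.get((state, ch))
        if state is None:  # no transition defined for this character: reject
            return False
    return state in ACCEPTING

PATTERN = re.compile(r"ab*c")

# The automaton and the regular expression agree on every sample.
for sample in ["ac", "abbbc", "abca", "bc", ""]:
    assert dfa_accepts(sample) == (PATTERN.fullmatch(sample) is not None)
```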

Actually, there is a cheat sheet skirt for girls

http://store.xkcd.com/xkcd/#RegexCheatSkirt

If you happen to be a girl, this might be a fantastic learning opportunity.

volodyako
+2  A: 

No... and Yes,

This is very much like one of those "Should I learn C" questions. No, regular expressions are never necessarily the only way to do something. But they are often a helpful abstraction that simplifies code and can (I really think) even make it more readable. Maybe it's because I love Jeff Friedl's Mastering Regular Expressions, or maybe it's because I do a lot at work in Perl. But for whatever reason, regular expressions are my go-to tool. It now seems easier for me to use a regex than most other string manipulation techniques.

Copas
RegEx + Perl = almost unreadable :)
Unkwntech
Perl is made AROUND regular expressions. You can't really use Perl and not use regular expressions :). Well you can, but you'd be missing out on a lot.
David Brunelle
+1  A: 

Understanding at least at the lowest level what regular expressions are/can do is immensely important. If you understand the concepts behind an NFA then you will understand other problems much better.

As for being good at Regular Expressions, I would say not necessary but really valuable. The fact is every Regular expression engine is different, so even if you've mastered one you may not be able to quickly do it elsewhere.

Tom Hubbard
"...The fact is every Regular expression engine is different..." From my experience this is not true: most implementations use a PCRE-based library, and therefore *MOST* of the time expressions can be carried back and forth. In addition, with a program like RegexBuddy one can easily 'translate'.
Unkwntech
+5  A: 

I would say yes.

They're so universally useful that it's a pretty significant handicap to be entirely without the ability to at least read & write simple ones.

Languages that Support Regular Expressions

  • Java
  • perl
  • python
  • PHP
  • C#
  • Visual Basic.NET
  • ASP
  • powershell
  • javascript
  • ruby
  • tcl
  • vbscript
  • VB6
  • XQuery
  • XPath
  • XSDs
  • MySQL
  • Oracle
  • PostgreSQL

IDEs and Editors that Support Regular Expressions

  • Eclipse
  • IntelliJ
  • Netbeans
  • Gel
  • Visual Studio
  • UltraEdit
  • JEdit
  • Nedit
  • Notepad++
  • Editpad Pro
  • vi
  • emacs
  • HAPEdit
  • PSPad

And let's not forget grep and sed!

As an employer, which would you rather have, a good programmer that - once in a while - will have to manually find/replace some set of similar strings across thousands of source files and require hours or days to do it, or a good programmer that - once in a while - spends five, or even ten minutes crafting a regex to accomplish the same thing that runs in the time it takes them to go get some coffee?

Real World Practical Usage in this very Answer

In fact, I actually used a regex in crafting this post. I initially listed the languages that support it in comma delimited prose. I then rethought it and changed the format to a bulleted list by searching for the expression (\w+), and replacing it with \n* $1 in JEdit. And the more experience you get with them, using them will become more and more cost effective for shorter and shorter sets of actions.
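The same search/replace can be sketched in Python. The sample text below is illustrative, since the original comma-delimited prose is not shown, and Python writes the backreference as `\1` where JEdit uses `$1`:

```python
import re

# Illustrative input; each entry is followed by a comma, as the pattern requires.
prose = "Java, perl, python, PHP,"

# Find a word followed by a comma, and replace it with a newline, "* ", and the word.
bulleted = re.sub(r"(\w+),", r"\n* \1", prose)
print(bulleted)
```

The space that followed each comma is left at the end of each line, and a multi-word entry such as "Visual Basic.NET" would need a wider pattern than `\w+`, so some hand cleanup is still to be expected.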

Trampas Kirk
+3  A: 

There is a great book out there written by Jeffrey Friedl called Mastering Regular Expressions. It gave me insight and was a real joy to read.

Even though I do not use regexes that often, they recently came in handy:

  • Input: Some CSV dictionary file with some kind of loose format, multiple translations, sayings, etc.

  • Output: Nice JSON.

  • First thought: Write a short grammar to parse all possible fields and values.

  • First attempt: Wrote a grammar, but there were some rough edges, mainly special cases, which occurred in just 0-1% of the data. Making a grammar that catches all would have been too-much-design.
  • Second attempt: I used a simple grammar catching the main fields and then passed over the rest to a routine, which applied some regular expressions. It was fast, conceptually easier than a full grammar and fun to write, too.

  • Summary: Regular expressions saved me hours and actually helped me see the special cases in the data and how and where they appeared.

Are they worth learning? Yes.

A must? No, but I know almost no one in the field who's not familiar with them.

Difficult to learn? Not at all.

The MYYN
A: 

You need them if you do a lot of string manipulation, search and replace, etc.

A: 

No, you always have two other options for suitable requirements.

  1. Ask a friend who knows regexes.

  2. Post the problem on SO.

le dorfier
A: 

I think it depends on what you are going to do. They are a must for mod_rewrite. But for the most part I agree that you can get around without them. But they can save you a lot of time for some tasks that would otherwise take a lot of tedious time.

John Isaacks
A: 

No, but better to be prepared, use Regular Expression Builder for training.

lsalamon