In the Stack Overflow podcasts, Joel Spolsky constantly harps on Jeff Atwood about Jeff not knowing how to write code in C. His statement is that "knowing C helps you write better code." He also always uses some sort of story involving string manipulation and how knowing C would allow you to write more efficient string routines in a different language.

As someone who knows a little C, but loves to write code in perl and other high-level languages, I have never once come across a problem that I was able to solve by writing C.

I am looking for examples of real-world situations where knowing C would be useful while writing a project in a high-level/dynamic language like perl or python.

Edit: Reading some of the answers you guys have submitted has been great, but it still doesn't make any sense to me in this regard:

Take the strcat example. There's a right way and a wrong way to combine strings in C. But why should I (as a high-level developer) think that I am smarter than Larry Wall? Why wouldn't the language designers write the string manipulation code the right way?

+2  A: 

It's hard to quantify exactly, but having an understanding of C will give you more insight into how higher-level language constructs are implemented, and as a consequence you'll be better able to use the constructs in an intelligent manner.

Adam Rosenfield
I would agree with you 100%, except that what I'm looking for is a concrete example of how it would be useful. Joel Spolsky (in the podcast) always claims it would allow you to write more efficient code or something. That's what I'm trying to figure out.
Eric Ryan Harrison
+3  A: 

To give you a specific reason: having to write my own garbage collection routines has helped me write better code.

I don't think I have ever found a problem that I haven't been able to solve with a higher-level language, but having started by learning C, I found it instilled in me quite a number of excellent development practices. Knowing how the rudimentary parts of an application's flow work lets you look at your own code and get a good picture of how the data flows and where it is stored. This then leads to a better understanding of how to track down leaking memory, slow disk reads, poorly constructed caches, etc.

Keeping track of Pointers... that's another one that comes to mind.

drovani
pointers are vilified; but they're the wheels of the CPU's bike.
Javier
Here's a problem you can't solve with a higher level language: bit fiddling in Lua
George Jempty
+3  A: 

Classic examples are things involving lower level memory management, such as the implementation of a linked list class:

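struct Node
{
    Data *data;          /* payload; Data is assumed to be defined elsewhere */
    struct Node *next;   /* next node, or NULL at the end of the list */
};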

Understanding how the pointers are used to iterate the list, and what they signify in terms of the machine architecture will allow you to better understand your high level code.
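As a rough illustration of the pointer walk described above, here is a minimal C sketch. The IntNode type and the push_front, sum_list and free_list helpers are hypothetical names introduced only for this example, and the nodes store plain ints rather than the Data pointers in the struct above.

```c
#include <stdlib.h>

/* Hypothetical node type for illustration: one int payload plus a link. */
struct IntNode {
    int value;
    struct IntNode *next;
};

/* Allocate a node and put it at the front of the list. Returns the new
   head, or the unchanged head if malloc fails. */
struct IntNode *push_front(struct IntNode *head, int value)
{
    struct IntNode *node = malloc(sizeof *node);
    if (node == NULL)
        return head;
    node->value = value;
    node->next = head;
    return node;
}

/* Walk the list by following each next pointer until it is NULL. */
long sum_list(const struct IntNode *head)
{
    long total = 0;
    for (const struct IntNode *p = head; p != NULL; p = p->next)
        total += p->value;
    return total;
}

/* Free every node; the next pointer is read before the node is freed. */
void free_list(struct IntNode *head)
{
    while (head != NULL) {
        struct IntNode *next = head->next;
        free(head);
        head = next;
    }
}
```

Each step of the loop in sum_list is one memory load of an address followed by a jump to wherever that address points, which is roughly what a high-level list iterator over linked nodes has to do as well.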

Another example which Joel was referring to was the implementation of string concatenation, and the right way to create a string from a set of data.

// this is efficient
for (int i=0; i< n; i++)
{
    strcat(str, data(i));
}

// this could be too, but you'd need to look at the implementation to be sure
std::string str;
for (int i=0; i<n; i++)
{
  str+=data(i);
}
1800 INFORMATION
strcat in a loop like that is an example of a really inefficient way to build a string. With C strings, you'll want to hang on to the end of the string after you cat onto it, so you don't have to find it again, in a case like your example. The std::string loop is likely _more_ efficient.
Logan Capaldo
Hah yeah right, I was thinking from the point of view that it wouldn't need to perform any additional allocations
1800 INFORMATION
I guess at least these discussions help prove the point :)
Logan Capaldo
A: 

I see it like this: everything boils down to C at the cross-platform level, and to assembly at the platform-specific level. It's like being a cross-country rally racer, where C is basic automotive mechanics. You can be a great driver, but when you get into trouble, knowing C means you can probably get yourself back in the race; if not, you're stuck calling the mechanics. Assembly is what the mechanics and manufacturers know. It's a worthy investment if that's what you want to do; otherwise you can just trust the mechanics.

For specifics, think about memory management, hardware drivers, physics engines, high-performance 3D graphics, TCP stacks, binary protocols, embedded software, and creating high-level languages like Perl.

Robert Gould
+1  A: 

Do you use arrays much? And do you come across situations where you need to store items in memory without knowing how many there will be (e.g. based on a query from the database)? Then I suppose C would teach you great things like stacks, structs and linked lists, which might help you. Regards, Andy

Andy
+21  A: 

The classic example that Joel Spolsky uses is on misuse of strcat and strlen, and spotting "Shlemiel the painter" algorithms in general.

It's not that you need C to solve problems that higher-level languages can't solve, it's that knowing C well gives you a perspective on what's going on underneath all those levels of languages that allows you to write better software. Because just such a perspective helps you avoid writing code which is, unknown to you, actually O(n^2), for example.
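To make the hidden O(n^2) concrete, here is a small C sketch of the strcat case. The function names join_rescan and join_tracked are made up for this example, and both assume the caller supplies a destination buffer that is large enough.

```c
#include <stddef.h>
#include <string.h>

/* Shlemiel version: strcat has to find the end of dst from the start on
   every call, so total work grows quadratically with the output length.
   dst must have room for all the parts plus the terminator. */
void join_rescan(char *dst, const char *const *parts, size_t count)
{
    dst[0] = '\0';
    for (size_t i = 0; i < count; i++)
        strcat(dst, parts[i]);
}

/* Linear version: remember where the string ends and append there. */
void join_tracked(char *dst, const char *const *parts, size_t count)
{
    char *end = dst;
    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(parts[i]);
        memcpy(end, parts[i], len);
        end += len;
    }
    *end = '\0';
}
```

Both functions produce the same string; the only difference is whether the position of the end is recomputed or remembered.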

Edit: Some clarification based on comments.

Knowing C is not a prerequisite for such knowledge; there are many ways to acquire it.

Knowing C is also not a guarantee of these skills. You may be proficient in C and yet still write horrible, grotty, kludgy code in every other language you touch.

C is a low-level language, yet it still has modern control structures and functions so you aren't always getting caught up in the fiddly details. It's very difficult to become proficient at C without gaining a mastery of certain fundamentals (such as the details of memory management and pointers), mastery of which often pays rich dividends when working in any language.

It's always about the fundamentals.

This is true in many pursuits as well as software engineering. It is not secret incantations that make the best programmers the best, rather it is a greater mastery of the fundamentals. Experience has shown that knowledge of C tends to have a higher correlation to mastery of certain of those fundamentals, and that learning C tends to be one of the easier and more common routes to acquiring such knowledge.

Wedge
right. to make great programs you need the whole picture, and a high level language is a big part of it. if you don't know C (or even better, assembly) you miss that whole part.
Javier
Knowing C is not a prerequisite to identifying a Shlemiel algorithm. 8 years ago, when I only had 3 years' experience, I was porting 0.5 million records from an Oracle table to a MySQL table using Perl/DBI. I identified my own Shlemiel algorithm, fixed it, and moved all the records in 0.5 hours.
George Jempty
He's not saying that it's a prerequisite, just that C can help you see such things.
PintSizedCat
@PintSizedCat: Can't see how knowing C helps you spot bad algorithms. Bad Schlemiel algorithms exist outside of C. Indeed, knowing C often confuses people because they think they can micro-optimize the ++ operation to speed up a dumb Schlemiel algorithm.
S.Lott
@S.Lott that would be an issue with not knowing your language and not an issue with knowing C
Jonas
Thanks. Great answer. Basically you're saying what I always felt was the right answer: That knowing how things work is more important than just "knowing C". So basically, Joel is just an old fuddy-duddy... ;) Great answer. Thanks.
Eric Ryan Harrison
+2  A: 

Knowing C is really not worth much. Many of us who know C deeply like to think that all that deep insight is valuable and important.

Some of us who know C can't think of a single specific feature of C that's helpful to know about.

Knowing how pointers work in C (especially with C's syntax) isn't all that helpful. In a high-level language your statements create objects and manage their interaction. Pointers and references are -- perhaps -- interesting from a hypothetical point of view. But the knowledge has no practical impact on how you use Java or Python.

The higher-level languages are the way they are. Knowing how doesn't change those languages; it doesn't change how you use them, debug or test them.

Knowing how to create or manipulate a linked list has no earthly impact on Python list class definition. None.

Knowing the difference between Linked List and Array List might help you write a Java program. But the C implementation doesn't help you choose between Linked List and Array List. The decision is independent of knowing C.

A bad algorithm is bad in every language. Knowing inner mysteries of C doesn't make a bad algorithm any less bad. Knowing C doesn't help you know the Java collections or the Python built-in types.

I can't see any value in learning C. Learning Fortran is just as valuable.

S.Lott
"A bad algorithm is bad in every language." Right, and knowing C lets you identify which algorithms a high-level language uses under the hood, letting you make the right choice. Without C, you're left with only the language docs, which almost never mention them (they just say that they're "fast enough").
Javier
How can a Java programmer be effective without knowing what references are and at least a bit of how they work? Without that knowledge, even understanding the difference between someString.equals( someOtherString) and (someString == someOtherString) becomes quite difficult.
Michael Burr
Knowing what references are IS important. Knowing C is not a way to learn that. You can learn that with the same simple pictures that the C programmers used when they were learning C. C itself doesn't help.
S.Lott
I'm not sure how C helps you identify a bad sort algorithm. A dumb sort is just as dumb in Java as it is in Python, irrespective of C. In choosing between HashMap and TreeMap, C does you no good at all.
S.Lott
@S.Lott: I see - I was confused by, "Pointers and references are ... interesting from a hypothetical point of view." I read that as if it were "pointers and Java references...", but I see I was misreading.
Michael Burr
@Michael Burr: I rewrote the answer based on your comment.
S.Lott
+2  A: 

Knowing C helps you to write better code in C. I guess that the example of Joel Spolsky is of little use in C++ or Objective-C, where specific classes for manipulating strings exist and have been crafted with performance in mind. Moreover, using C tricks in other languages may be counterproductive.

Nevertheless, C knowledge is very helpful for understanding general concepts in other languages and what is under the hood in many situations.

mouviciel
+6  A: 

It's a mistake to assume that learning C will somehow automatically give you a better understanding of low-level programming concerns. In a lot of cases even C is too high level to give you a good understanding of efficiency concerns.

A classic is i++ versus ++i. It's over-cited, so perhaps most people know the implications about performance between these two operations. But learning C wouldn't magically teach you this by itself.

I guess I understand arguments about strings. When string operations are made deceptively simple, people often use them in inefficient ways. But again, knowing that strncat exists doesn't give you a full appreciation for the efficiency concerns. A lot of C programmers probably haven't even thought about the fact that strncat has to do a strlen operation internally.
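The hidden scan is easy to see if you write the function out by hand. The following is a hypothetical reimplementation for illustration only (sketch_strncat is not a library function); real C libraries are usually more optimized, but they still have to locate the end of the destination first.

```c
#include <stddef.h>

/* Hypothetical stand-in for strncat, written out to show its steps. */
char *sketch_strncat(char *dst, const char *src, size_t n)
{
    char *p = dst;

    /* Step 1: the hidden strlen. Scan dst to find its terminator. */
    while (*p != '\0')
        p++;

    /* Step 2: copy at most n characters from src. */
    while (n > 0 && *src != '\0') {
        *p++ = *src++;
        n--;
    }

    /* Step 3: always terminate the result. */
    *p = '\0';
    return dst;
}
```

Step 1 costs time proportional to the current length of dst on every call, which is exactly the cost that is invisible at the call site.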

Even using C, it's important to understand what's going on behind the scenes if efficiency is a concern. People who know C tend to view things in a progression. Assembly and machine code are the building blocks of C, while C is a building block of higher level languages.

This isn't specifically true, but it's obvious that C is "closer to the metal" than many higher level languages. This has at least two effects: efficiency concerns aren't as hidden behind implicit behavior, and it's easier to screw up.

So you want a specific example of how knowing C gives you an advantage. I don't think there is one. I think what people mean when they say this is that knowing what's going on behind the scenes in whatever language you're happening to write for helps you make more intelligent decisions about how to write code. However, it's a mistake to assume that C is "what's going on behind the scenes" in Java, for instance.

Dan Olson
I wish I could thumbs up an answer more than once... This is the best answer I've seen on this matter.
Jonas
A: 

You cannot write an OS kernel in Perl; C would be a much better choice for that, because it is low-level enough to express everything the kernel should do, and portable enough to let you port your kernel to different architectures

dmityugov
Yes, but I'm not trying to write a kernel in Perl. I'm just trying to get my job done. I already know C, but I've never come across a situation where I was writing Perl and thought, "Holy crap, I should write this in C". I always just assume Larry Wall is smarter than me.
Eric Ryan Harrison
A: 

Knowing C is not a requirement for being able to use higher-level languages effectively, but it certainly can help one's general understanding of how computers and software work. I think it's similar to the assertion that knowing some assembly language or computer architecture/hardware logic (and/or/nand gates, etc.) can help a C programmer be a better programmer.

Sometimes in order to solve a problem it helps to know how things are working 'underneath' what you're doing.

I don't think this means a programmer must know C in order to be a good programmer, but I think that knowing C can be helpful to almost any programmer.

Michael Burr
A: 

Not knowing Perl well, I am wondering if it is now possible to distribute processor load to more than one physical core with several threads created in a single program in Perl, without spawning additional processes

dmityugov
A: 

I don't think there can be any specific example.

What learning C does for you is give you an insight, a broadening of the mind, into how computers (and software) work. It's a very abstract thing.

It doesn't make you write better code in python, it just makes you more of a computer scientist.

The reference that Wedge made to Joel's article mentioning Shlemiel the painter is an interesting one but has no relevance here. That algorithm is not tied to C in any particular way (although it manifests itself in null-terminated strings).

Python's strings are immutable anyway, and completely different from C's model of strings, so I don't quite see the relationship.

I suppose one concrete example is optimizing a parser or a lexer or a program that keeps writing to a string buffer all the time. If you use normal strings instead of a string buffer, you'll run across a problem when you build very large strings.

Consider that:

a = a + b

makes a copy of both a and b. It doesn't change the string that was referenced by a, it creates a new string, allocating more memory, etc.

If a becomes considerably large, and you keep adding small things to it, then Shlemiel the painter will manifest himself.
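Spelled out in C, an immutable-string a + b looks roughly like the sketch below. The helper name concat_new is hypothetical, and an actual interpreter may be cleverer than this in some cases; the point is only the shape of the work.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated string holding a followed by b, or NULL if
   allocation fails. Neither input is modified; the caller frees the
   result. */
char *concat_new(const char *a, const char *b)
{
    size_t len_a = strlen(a);
    size_t len_b = strlen(b);
    char *result = malloc(len_a + len_b + 1);
    if (result == NULL)
        return NULL;
    memcpy(result, a, len_a);              /* copies all of a, every time */
    memcpy(result + len_a, b, len_b + 1);  /* copies b and its terminator */
    return result;
}
```

Called in a loop with a growing a and a small b, the first memcpy is where the time goes: each iteration recopies everything built so far.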

But then again, knowing this has nothing to do with knowing C, just with knowing how your language implements things at the low level. (This is where having experience in C will help you.)

hasen j
A: 

Technically, all of the deficiencies of C would force you to code around them; making you write more code -> making you more experienced in general. Lacking any portable integer bigger than 32-bits, for example, C has, in the past, made me write my own bignum library.

The lack of implicit memory, resource and error management (garbage collection, RAII, automatically-called constructors/destructors, maybe exceptions) forces C users to write a lot of initialization, error-handling and cleanup code. It may just be me, but I'm never tired of writing such code. I go and read the documentation of every external function I call, return to my code and check for every return value and other failure-indicative stuff. It even makes me feel safe!
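For readers who have not written this kind of code, here is a minimal sketch of the initialization, error-checking and cleanup pattern being described. The Pair struct and the pair_init and pair_destroy functions are hypothetical names invented for the example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical type that owns two heap-allocated strings. */
struct Pair {
    char *first;
    char *second;
};

/* Returns 0 on success and -1 on failure. On failure nothing is leaked
   and *out is left untouched. */
int pair_init(struct Pair *out, const char *first, const char *second)
{
    char *a = NULL;
    char *b = NULL;

    if (out == NULL || first == NULL || second == NULL)
        return -1;

    a = malloc(strlen(first) + 1);
    if (a == NULL)
        goto fail;
    b = malloc(strlen(second) + 1);
    if (b == NULL)
        goto fail;

    strcpy(a, first);
    strcpy(b, second);
    out->first = a;
    out->second = b;
    return 0;

fail:
    free(b);   /* free(NULL) is a no-op, so one cleanup path is enough */
    free(a);
    return -1;
}

/* Every successful pair_init needs exactly one matching pair_destroy. */
void pair_destroy(struct Pair *p)
{
    free(p->first);
    free(p->second);
    p->first = NULL;
    p->second = NULL;
}
```

The single fail label is one common way to keep the cleanup in one place; other C codebases prefer nested ifs or early returns.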

This last point is probably the biggest one to be made in favor of the argument. You can only write so many malloc()/free() pairs before you start to analyze the lifetime of every single variable you come across in every single language! C++'s automatic-storage objects don't help this disorder, either.

Writing truly portable C code often requires the programmer to be free of a lot of assumptions about the host system - think sizeof(), CHAR_BIT, unsigned long, UINT_MAX. While this hasn't helped me write better code in other languages, it has helped me think about possible alternate implementations: how a tiny microprocessor could still run my C code, generating a gazillion RISC instructions for my simple one-line statement. (That is another thing; not many other languages map to and from a given assembly language so easily in my head. Then again, that may just be me.)
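A small sketch of what "free of assumptions" can look like in practice, using only the standard limits.h macros. The function names ulong_bits and add_would_wrap are hypothetical.

```c
#include <limits.h>
#include <stddef.h>

/* Width of unsigned long in bits, computed instead of assumed to be 32
   or 64. (Assumes the type has no padding bits, which holds on
   mainstream platforms.) */
size_t ulong_bits(void)
{
    return sizeof(unsigned long) * CHAR_BIT;
}

/* Would a + b wrap around for unsigned int? The check is written against
   UINT_MAX so it does not depend on how wide unsigned int happens to be. */
int add_would_wrap(unsigned int a, unsigned int b)
{
    return a > UINT_MAX - b;
}
```

Neither function contains a hard-coded width, so the same source gives correct answers on a 16-bit microcontroller and on a 64-bit desktop.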

Of course, none of these arguments apply only to C. @S.Lott has a valid point - Fortran might be an equally good alternative. But there is so much C code around! A whole personal computer system from top to bottom (applications to libraries to drivers to kernel) is available in source code in C. It would be such a waste if you could not read it.

aib
A: 

In Python, say you have a function

def foo(l=[]):
  l.append("bar")
  return l

On some version of Python, available about a year ago, running foo() four times, you'd get a really interesting result (i.e. ["bar","bar","bar","bar"]).

It seems that someone implemented the default parameters as a static variable (and without resetting it), so unexpected results happen.
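Python's documentation describes default parameter values as being evaluated once, when the function is defined, which is why the list above keeps growing. The closest C analogy is a static local variable. The sketch below is only that analogy: it is not how any Python interpreter implements defaults, and foo_calls is a hypothetical name.

```c
/* Analogy only: a static local is initialised once and then keeps its
   value between calls, much like the shared default list above. */
int foo_calls(void)
{
    static int bars = 0;   /* set to 0 once, not on every call */
    bars++;                /* stands in for l.append("bar") */
    return bars;           /* number of "bar" entries so far */
}
```

Calling foo_calls four times returns 1, 2, 3 and 4, mirroring the list that grows by one "bar" per call.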

Perhaps my example was contrived - a friend of mine who actually likes Python found this peculiar bug, but the fact of the matter is all of these languages are implemented in C or C++. Not knowing and not understanding concepts that are fundamental to the base language means that you won't have an in-depth understanding of languages that are built on top of that.

I find all the "why bother with C/C++/ASM" questions silly. If you're inclined enough to learn a language, that means that you're curious enough to get into it in the first place. Why stop just before C?

Calyth
A: 

Knowing C is great because it does nothing behind your back (GC, bounds checking, etc.). It only does exactly what you tell it to. Nothing is implied. Even C++ does things you don't tell it to with RAII (of course, it is implied that the object is destructed when it goes out of scope, but you don't actually write that). C is a great way to learn what goes on 'under the hood' of the computer, without having to write assembly.

Zifre
+1  A: 
foljs
A: 

inefficient code (eg loops of string+=) are typically inefficient in any language. what difference does it make if someone explains why it is inefficient in one language or the other? knowing C, but not realizing that a method is inefficient, is no different than knowing python and not realizing the same.

Dustin Getz
+2  A: 

For the purposes of argument, suppose you wanted to concatenate the string representations of all the integers from 1 to n (e.g. n = 5 would produce the string "12345"). Here's how one might do that naïvely in, say, Java.

String result = "";
for (int i = 1; i <= n; i++) {
    result = result + Integer.toString(i);
}

If you were to rewrite that code segment (which is quite good-looking in Java) in C as literally as possible, you would get something to make most C programmers cringe in fear:

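/* Assumes <stdio.h>, <stdlib.h> and <string.h>, and an int n in scope. */
char *result = malloc(1);
*result = '\0';
for (int i = 1; i <= n; i++) {
    char *intStr = malloc(12);              /* room for a 32-bit int, sign and '\0' */
    snprintf(intStr, 12, "%d", i);          /* itoa is not standard C */
    char *tempStr = malloc(strlen(result) + strlen(intStr) + 1);
    strcpy(tempStr, result);                /* recopies everything built so far */
    strcat(tempStr, intStr);
    free(result);
    free(intStr);
    result = tempStr;
}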

Because strings in Java are immutable, Integer.toString creates a dummy string and string concatenation creates a new string instance instead of altering the old one. That's not easy to see from just looking at the Java code. Knowing how said code translates into C is one way of learning exactly how inefficient said code is.
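For contrast, here is a sketch of what a C programmer would more likely write: one buffer that grows geometrically, which is roughly the idea behind a string builder in higher-level languages. The function name digits_upto is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Builds "123...n" in a single growing buffer. Returns a malloc'd string
   that the caller must free, or NULL on failure. */
char *digits_upto(int n)
{
    size_t cap = 16;
    size_t len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;
    buf[0] = '\0';

    for (int i = 1; i <= n; i++) {
        char piece[32];   /* ample for any int written in decimal */
        int piece_len = snprintf(piece, sizeof piece, "%d", i);
        if (piece_len < 0) {
            free(buf);
            return NULL;
        }

        /* Double the capacity when needed so total copying stays linear. */
        size_t needed = len + (size_t)piece_len + 1;
        if (needed > cap) {
            size_t new_cap = cap;
            while (new_cap < needed)
                new_cap *= 2;
            char *grown = realloc(buf, new_cap);
            if (grown == NULL) {
                free(buf);
                return NULL;
            }
            buf = grown;
            cap = new_cap;
        }

        /* Append at the remembered end; no rescan, no full recopy. */
        memcpy(buf + len, piece, (size_t)piece_len + 1);
        len += (size_t)piece_len;
    }
    return buf;
}
```

Compared with the literal translation, this version allocates a handful of times instead of several times per iteration, and never recopies the accumulated string except when the buffer grows.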

John Calsbeek
A: 

I think it is worth knowing some low-level language, and there are pragmatic reasons to choose C:

  • It's low-level, close to assembler
  • It's widespread

Understanding the whole stack is valuable. Sometimes you need to debug something's guts. Sometimes you cannot fix a performance problem without low-level knowledge (this is often not the case, e.g., when the performance problem is purely algorithmic, but sometimes it is).

Why is C widely considered the quintessential "bottom of the stack", and not some other language(s)? I think this is because C is a low-level programming language, and C won. It has been a while now, but C was not always as dominant. To take just one famous example, the proponents of Common Lisp (which had its own ways of writing low-level code) were hoping their language would be popular, too, and eventually lost.

The following are usually implemented in C:

  • operating systems (Unix variants, Windows, many embedded operating systems)
  • higher-level programming languages (many popular implementations of Java, Python, etc)
  • (obviously) reams of popular open source projects

I'm not a hardware person, but I gather that C has influenced CPU design heavily, too.

So if you believe in understanding the whole stack, learning C is, from a pragmatic perspective, the best choice.

As a caveat, I think it's worth learning assembler, as well. Although C is close to the metal, I didn't fully understand C until I had to do some assembler. It is occasionally helpful to understand how function calls are actually performed, how for loops are implemented, etc. Less important, but also useful, is having to (at least once) deal with a system without virtual memory. When using C on Windows, Unix, and certain other operating systems, even humble malloc does a lot of work under the covers that is easier to appreciate, debug and/or tune if you've ever had to deal with manually locking and unlocking memory regions (not that I would recommend doing so on a regular basis!).

Jacob Gabrielson