I was listening to the Stack Overflow Podcast #34 (59:00 - 1:02:00) and Joel mentioned the difficulty of pointers and recursion. He also mentioned thinking in two levels of abstraction.

He also mentions these concepts in his article The Perils of JavaSchools.

.... But when you struggle with pointers, your program produces the line Segmentation Fault and you have no idea what's going on, until you stop and take a deep breath and really try to force your mind to work at two different levels of abstraction simultaneously.

I came across this two years ago, well before taking the C++ class at the local university. When I took C++ I thought I understood pointers instantly. I started learning programming with PHP. I read almost the entire manual for PHP 5.1.2 before diving in. I learned how PHP handles variable references in some massive lookup table. Pointers seemed to me to simply work as a big lookup table that points to physical memory. (Which mixes processor instructions with data and makes those enjoyable buffer overflow exploits so easily possible.) My Computer Architecture class stopped just short of writing working assembly, but we did get a two-hour crash course in it. The classes available are an odd mix of hard theory and practical programming. I have not learned anything below C++ in abstraction and I really didn't learn much about assembly.

However, I still cannot understand what Joel is talking about with "two levels of abstraction".

I guess I have a number of questions. I don't know if I have a wrong view of pointers. Am I gifted in some bizarre way that I just "get it"? Am I hopelessly fooling myself into thinking I actually know something? I've read some of the stuff on Understanding Pointers, but it hasn't really pointed to anything I don't think I already understand.

It's driving me nuts. It all stems from me not understanding what Joel is talking about when he mentions "two levels of abstraction". I feel like I'm missing some important critical concept in computer science.

Help me out! What is the "two levels of abstraction" that Joel keeps talking about?

+14  A: 

I think he's getting at the concept that pointers have values at two abstract levels. In other words, they have:

  • a value (level-1) that is the pointer itself.
  • a value (level-2) that is what they point to.

You can change both those values of a pointer independently.

A lot of the trouble people have is with dreaded code like appendToStr() in the following:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void appendToStr (int *sz, char **str, char *app) {
    char *newstr;
    int reqsz;

    /* If no string yet, create it with a bit of space. */

    if (*str == NULL) {
        *sz = strlen (app) + 10;
        if ((*str = malloc (*sz)) == NULL) {
            *sz = 0;
            return;
        }
        strcpy (*str, app);
        return;
    }

    /* If not enough room in string, expand it [could use realloc()]. */

    reqsz = strlen (*str) + strlen (app) + 1;
    if (reqsz > *sz) {
        *sz = reqsz + 10;
        if ((newstr = malloc (*sz)) == NULL) {
            free (*str);
            *str = NULL;
            *sz = 0;
            return;
        }
        strcpy (newstr, *str);
        free (*str);
        *str = newstr;
    }

    /* Append the desired string to the (now) long-enough buffer. */

    strcat (*str, app);
}

static void dump(int sz, char *x) {
    /* %p expects a void *, and strlen() returns size_t, so cast both. */
    if (x == NULL)
        printf ("%8p   [%2d]   %3d   [%s]\n", (void *)x, sz, 0, "");
    else
        printf ("%8p   [%2d]   %3d   [%s]\n", (void *)x, sz, (int)strlen (x), x);
}

static char *arr[] = {"Hello.", " My", " name", " is", " Pax",
                      " and"," I", " am", " old."};

int main (void) {
    int i;
    char *x = NULL;
    int sz = 0;

    printf (" Pointer   Size   Len   Value\n");
    printf (" -------   ----   ---   -----\n");
    dump (sz, x);
    for (i = 0; i < (int)(sizeof (arr) / sizeof (arr[0])); i++) {
        appendToStr (&sz, &x, arr[i]);
        dump (sz, x);
    }
    free (x);   /* Release the buffer; free (NULL) is a no-op. */
    return 0;
}

which outputs the following. Note how the level-1 value (e.g., 0x6701b8) changes independently of the level-2 value (e.g., "Hello. My name").

 Pointer   Size   Len   Value
 -------   ----   ---   -----
     0x0   [ 0]     0   []
0x6701b8   [16]     6   [Hello.]
0x6701b8   [16]     9   [Hello. My]
0x6701b8   [16]    14   [Hello. My name]
0x6701d0   [28]    17   [Hello. My name is]
0x6701d0   [28]    21   [Hello. My name is Pax]
0x6701d0   [28]    25   [Hello. My name is Pax and]
0x6701d0   [28]    27   [Hello. My name is Pax and I]
0x6701f0   [41]    30   [Hello. My name is Pax and I am]
0x6701f0   [41]    35   [Hello. My name is Pax and I am old.]

Re your comment:

Am I gifted in some bizarre way that I just "get it"?

I think that, after a while, your brain just starts thinking like a machine and it's quite possible that other things, like social skills, may suffer as a result :-)

It's the same effect when you start doing cryptic crosswords - after a while they seem easier because, despite the weird clues, you know an answer is right because there are two ways to get from the clue to the answer. But you find you can no longer do quick crosswords at all.

paxdiablo
I almost think I get **str... A pointer to an array? Not quite sure how that works.
epochwolf
** means a pointer to a pointer. In this case the string "is" an array of characters, so char **str simply means a pointer to a pointer to some area where character(s) are stored. Or just "a pointer to a pointer to a string". By sending in the pointer to the pointer, the callee can replace the pointer for the caller (which it does here) and redirect it to somewhere else.
Fredrik
Thanks for a really detailed response. I'll have to read through this a few more times before I can really make sense of it. I think you touched on what I don't get yet. I understand on some abstract level what pointers do but I don't quite get the C code you posted. :) (Nice to know that this was both simple and difficult. I was missing something because I've never had to deal with C.)
epochwolf
+4  A: 

This quote from the same article seems to get to the point:

Pointers and recursion require a certain ability to reason, to think in abstractions, and, most importantly, to view a problem at several levels of abstraction simultaneously. And thus, the ability to understand pointers and recursion is directly correlated with the ability to be a great programmer.

I think he's just talking in general terms about how difficult it is for some people to understand concepts like pointers and recursion. Specifically, with pointers you need to understand these two levels of 'what's going on':

  • At one level, a pointer is a variable like any other, which holds a value
  • But you also need to understand that the value it holds is a memory address, which refers to an area of memory holding another value

There are two levels you have to get your head around. For some people this is fairly easy, because they're able to handle these levels of abstraction ok.

I think that anyone, when explained what pointers are, would be able to get the general idea. But I think that only a certain type of person would find it easy to use them in day-to-day work, because doing that requires you to continue to think in two different levels: the pointer itself which is the address, and the value it 'points' to. Some people find this hard, and Joel's argument is that those people would not make very good programmers.

thomasrutter
+1  A: 

There is another article by Spolsky that should help you out with two or more levels of abstraction - The Law of Leaky Abstractions.

grigory
A: 

The reference to recursion specifically makes me think of recursive data structures, such as linked lists and trees. In the simplest implementation of such structures, a pointer to a node represents two things at once: (1) an entire structure (e.g., a sub-list or sub-tree), and (2) the single node at the head or root of the structure. These interpretations are on different levels of abstraction, and to use pointers effectively you need to understand them both and switch between them easily.

Nathan Kitchen
+1  A: 

To understand the concept of pointers and to understand pointers in a live context are, unfortunately, often two different things.
I think the key issue in this question is understanding the concept of abstraction. All programming deals with abstraction, on many, many levels, and all good designs are built on good abstractions. The thing is, if a pointer to an abstraction layer (e.g. an object), or inside an abstraction layer, suddenly becomes invalid, that level of abstraction breaks and you have to deal with the underlying abstractions as well.

E Dominique