I'm working on a programming language that uses C++ as its target language for now. I'm hitting an exceptionally strange backtrace.

#1  0x08048d09 in factorial (n=0x8052160) at ir.cpp:35
35      shore::builtin__int * __return = NULL;
(gdb) bt
#0  shore::builtin__int::__mul__ (this=0x8052160, other=0x8052288) at /home/alex/projects/shore/shore/runtime/int.h:36
#1  0x08048d09 in factorial (n=0x8052160) at ir.cpp:35
#2  0x08048cfa in factorial (n=0x80520b8) at ir.cpp:35
#3  0x08048cfa in factorial (n=0x8052018) at ir.cpp:35
#4  0x08048d6f in main () at ir.cpp:43

Specifically, it appears that declaring the type of __return is somehow triggering the __mul__ method on builtin__int to be called, and I have no idea why. builtin__int looks like:

#ifndef _SHORE_INT_H
#define _SHORE_INT_H

#include "gc.h"


namespace shore {
    class builtin__int : public shore::Object {
        public:
            // Some day this will be arbitrary precision, but not today.
            long long value;

            static builtin__int* new_instance(long long value_) {
                builtin__int* val = new builtin__int(value_);
                shore::GC::register_object(val);
                return val;
            }

            builtin__int(long long value_) {
                this->value = value_;
            }

            builtin__bool* __eq__(builtin__int* other) {
                return builtin__bool::new_instance(this->value == other->value);
            }

            builtin__int* __add__(builtin__int* other) {
                return builtin__int::new_instance(this->value + other->value);
            }

            builtin__int* __sub__(builtin__int* other) {
                return builtin__int::new_instance(this->value - other->value);
            }

            builtin__int* __mul__(builtin__int* other) {
                return builtin__int::new_instance(this->value * other->value);
            }
    };
}
#endif

Any ideas as to what on earth is compelling C++ to call the __mul__ method?

EDIT: Added the source of ir.cpp

#include "builtins.h"
#include "frame.h"
#include "object.h"
#include "state.h"
std::vector < shore::Frame * >shore::State::frames;
shore::GCSet shore::GC::allocated_objects;
class
    factorial__frame:
    public
    shore::Frame {
  public:
    shore::builtin__int *
    n;
    shore::GCSet
    __get_sub_objects() {
    shore::GCSet s;
    s.
    insert(this->n);
    return
        s;
}};
class
    main__frame:
    public
    shore::Frame {
  public:
    shore::GCSet
    __get_sub_objects() {
    shore::GCSet s;
    return
        s;
}};
shore::builtin__int * factorial(shore::builtin__int * n)
{
    shore::builtin__int * __return = NULL;
    factorial__frame frame;
    shore::State::frames.push_back(&frame);
    frame.n = NULL;
    frame.n = n;
    if (((frame.n)->__eq__(shore::builtin__int::new_instance(0)))->value) {
    __return = shore::builtin__int::new_instance(1);
    shore::GC::collect();
    shore::State::frames.pop_back();
    return __return;
    }
    __return =
    (frame.n)->
    __mul__(factorial
     ((frame.n)->
      __sub__(shore::builtin__int::new_instance(1))));
    shore::GC::collect();
    shore::State::frames.pop_back();
    return __return;
}
int
main()
{
    main__frame     frame;
    shore::State::frames.push_back(&frame);
    builtin__print(factorial(shore::builtin__int::new_instance(3)));
    shore::State::frames.pop_back();
}
+5  A: 

Identifier names with double underscores are reserved. You could be colliding with a compiler-generated name.
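
As a rough illustration of the rule for a code generator: C++ reserves any identifier that contains two consecutive underscores, and any identifier that begins with an underscore followed by an uppercase letter (which also covers the _SHORE_INT_H include guard in the question); names beginning with an underscore are additionally reserved in the global namespace. The sketch below checks a name against those rules and shows one possible way to rename generated identifiers. Both is_reserved_identifier and mangle, and the "shore" prefix with "_u" escaping, are hypothetical choices made for this example, not something taken from the Shore compiler.

```cpp
#include <string>

// Returns true if `name` is in a class of identifiers reserved for the
// C++ implementation:
//   * it contains two consecutive underscores anywhere, or
//   * it begins with an underscore (reserved outright when followed by an
//     uppercase letter, and reserved in the global namespace otherwise;
//     this check conservatively rejects every leading underscore).
bool is_reserved_identifier(const std::string& name) {
    if (name.find("__") != std::string::npos) return true;
    if (!name.empty() && name[0] == '_') return true;
    return false;
}

// Hypothetical renaming scheme for generated code: prefix every name with
// "shore" and escape each '_' as "_u". Every underscore in the output is
// followed by 'u', so the result never contains "__" and never starts
// with '_'.
std::string mangle(const std::string& source_name) {
    std::string out = "shore";
    for (char c : source_name) {
        if (c == '_') {
            out += "_u";
        } else {
            out += c;
        }
    }
    return out;
}
```

With this scheme, mangle("__mul__") yields "shore_u_umul_u_u" and mangle("builtin__int") yields "shorebuiltin_u_uint", neither of which is reserved. Any other escaping would do as long as the output avoids the reserved patterns.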

Fred Larson
Does that apply to just local vars, or members on classes as well?
Alex Gaynor
@lazypython: That applies to all identifiers.
Georg Fritzsche
All identifiers. locals, globals, functions, classes, class members, macros, and anything else I've forgotten.
Steve Jessop
And now for the really stupid question :). Does this apply to items with 2 or more leading underscores, or exactly 2 underscores? So is ___return (3 underscores) valid?
Alex Gaynor
2 or more leading underscores is a superset of double underscores, so they're off the table too I'm afraid.
fbrereto
No. All names containing two contiguous underscores, regardless of where the underscores are found in the identifier, are reserved.
James McNellis
namespaces! I knew I was leaving stuff out.
Steve Jessop
"___return" has two underscores in a row, so no, it's not valid. Also note that it's not just leading underscores. Double underscores anywhere in the identifier are not allowed.
Tim
I was just searching for any mentions of `builtin__int` and/or `__mul__` in gcc, and Google just sent me back to this question. I haven't decided whether to be impressed or not.
Steve Jessop
The pretty definitive SO answer about reserved names in C/C++: http://stackoverflow.com/questions/228783/what-are-the-rules-about-using-an-underscore-in-a-c-identifier/228797#228797
Michael Burr
It might be interesting to see what the preprocessed output is (use '-E' option for GCC).
Michael Burr
+1  A: 

Looks to me like the translator is failing, and somehow gcc (or whatever) thinks that shore::builtin__int is a value of some kind, and that you are trying to multiply it by __return, instead of declaring the variable __return as type shore::builtin__int *...

Obviously, if this thing is compiling at all, and giving you run-time errors, then whatever type the multiplication would give you is a valid LHS...

Brian Postow
+5  A: 

A bit of a guess: the initialization in the line shore::builtin__int * __return = NULL; does nothing, since it's always overwritten. The compiler would be perfectly entitled to (a) reorder it down to where __return is assigned, by the statement that does call __mul__ and then (b) remove the code entirely. But maybe it's left the source line in the debugging info, and either the linker or gdb has ended up thinking the call instruction belongs to the wrong one of the several source lines in the vicinity.

Never trust source debugging unless you can see the disassembly too. Compiled languages - bah, humbug. And so forth.

Steve Jessop
Yep, it turns out the compiler was reordering it and for some reason gdb was bungling the line numbers. The error was actually coming from the place where factorial was called (and therefore where real multiplication was happening). The root cause of my original bug was that the GC was deallocating return values :(
Alex Gaynor
@Steve Jessop: nice catch!
Michael Burr
Thanks :-) I noticed that the same source line was (according to the backtrace) also responsible for the recursive call back into factorial itself, so once lazypython added the source for ir.cpp I knew what I was looking for. Comes of time spent watching the pseudo-random walk you get if you compile for debug with optimisation and then single-step.
Steve Jessop
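
The root cause reported in the comments above (the GC freeing return values) appears consistent with the generated factorial(): __return is a plain local that no frame lists in __get_sub_objects(), yet GC::collect() runs before the function returns. The following is a minimal sketch of that failure mode and one possible fix, not the actual Shore runtime. Handle, Frame, Heap, make_one_unrooted and make_one_rooted are all hypothetical names, and integer handles stand in for object pointers so the example can test liveness without touching freed memory.

```cpp
#include <map>
#include <set>
#include <vector>

using Handle = int;  // hypothetical stand-in for an object pointer

struct Frame {
    std::vector<Handle> roots;  // objects this frame keeps alive
};

struct Heap {
    std::map<Handle, long long> objects;  // live objects and their values
    std::vector<Frame*> frames;           // active call frames
    Handle next = 0;

    Handle allocate(long long value) {
        objects[next] = value;
        return next++;
    }

    // Frees every object that no active frame lists as a root.
    void collect() {
        std::set<Handle> reachable;
        for (const Frame* f : frames) {
            reachable.insert(f->roots.begin(), f->roots.end());
        }
        for (auto it = objects.begin(); it != objects.end();) {
            if (reachable.count(it->first)) {
                ++it;
            } else {
                it = objects.erase(it);
            }
        }
    }

    bool alive(Handle h) const { return objects.count(h) != 0; }
};

// Same shape as the generated code: the result is held only in a local,
// so the collector cannot see it and frees it before the return.
Handle make_one_unrooted(Heap& heap) {
    Frame frame;
    heap.frames.push_back(&frame);
    Handle result = heap.allocate(1);
    heap.collect();
    heap.frames.pop_back();
    return result;  // refers to an object that no longer exists
}

// Possible fix: register the result as a root before collecting.
Handle make_one_rooted(Heap& heap) {
    Frame frame;
    heap.frames.push_back(&frame);
    Handle result = heap.allocate(1);
    frame.roots.push_back(result);
    heap.collect();
    heap.frames.pop_back();
    return result;
}
```

In this sketch the caller would still need to root the returned object in its own frame before the next collection; whether Shore fixed the bug this way is not stated in the thread.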