I'm working on what should be a fairly basic iteration. I understand that I could accomplish it in Ruby, but I am already working in a C extension, so I would prefer to keep this function in C with the rest of the code, especially since this should work (one way or another) without issue.

The issue is with rb_block_call. Here is how README.EXT describes rb_block_call:

VALUE rb_block_call(VALUE recv, ID mid, int argc, VALUE * argv,
          VALUE (*func) (ANYARGS), VALUE data2)

Calls a method on the recv, with the method name specified by the symbol mid, supplying func as the block. func will receive the value from yield as the first argument, data2 as the second, and argc/argv as the third/fourth arguments.

So my understanding (verified by looking at the Ruby internals) is that the receiving function should look like:

VALUE function( VALUE rb_yield_value, VALUE data2, int argc, VALUE * argv );

And here we hit our problem. In my use case (which I will include below), rb_yield_value and data2 are passed as expected; argc, on the other hand, is always set to 1, argv[ 0 ] is rb_yield_value, argv[ 1 ] is false, argv[ 2 ] is rb_yield_value, argv[ 3 ] throws an exception.

It does not matter what I pass for argc and argv; passing 0 and NULL gives the same result, as does passing 1 and a VALUE set to Qtrue. Everything with argc/argv remains as described.

Here is the code I am working with:

VALUE rb_RPBDB_DatabaseObject_internal_cursorForCallingContext( VALUE rb_self ) {

    //  when we are looking for the contextual iterator, we look up the current backtrace
    //  at each level of the backtrace we have an object and a method;
    //  if this object and method match keys present in self (tracking calling contexts for iteration in this iteration class) return cursor

    VALUE   rb_cursor_context_storage_hash  =   rb_RPBDB_DatabaseObject_internal_cursorContextStorageHash( rb_self );

    VALUE   rb_cursor   =   Qnil;

    if ( RHASH_SIZE( rb_cursor_context_storage_hash ) ) {

        rb_block_call(  rb_mKernel, 
                        rb_intern( "each_backtrace_frame" ), 
                        1, 
                        & rb_cursor_context_storage_hash, 
                        rb_RPBDB_DatabaseObject_internal_each_backtrace_frame, 
                        rb_cursor );    
    }

    return rb_cursor;
}

//  walk up the stack one frame at a time
//  for each frame we need to see if object/method are defined in our context storage hash
VALUE rb_RPBDB_DatabaseObject_internal_each_backtrace_frame(    VALUE   rb_this_backtrace_frame_hash, 
                                                                VALUE   rb_cursor_return,
                                                                int     argc,
                                                                VALUE*  args )  {

    //  why are we getting 3 args when argc is 1 and none of the 3 match what was passed?
    VALUE   rb_cursor_context_storage_hash  =   args[ 0 ];

    //  each frame is identifiable as object/method
    VALUE   rb_this_frame_object    =   rb_hash_aref(   rb_this_backtrace_frame_hash,
                                                        ID2SYM( rb_intern( "object" ) ) );
    VALUE   rb_this_frame_method    =   rb_hash_aref(   rb_this_backtrace_frame_hash,
                                                        ID2SYM( rb_intern( "method" ) ) );

    //  we likely have "block in ..." for our method; we only want the "..."
    rb_this_frame_method    =   ID2SYM( rb_to_id( rb_funcall(   rb_obj_as_string( rb_this_frame_method ),
                                                                rb_intern( "gsub" ),
                                                                2,
                                                                rb_str_new2( "block in " ),
                                                                rb_str_new2( "" ) ) ) );

    VALUE   rb_cursor_object_context_hash   =   rb_RPBDB_DatabaseObject_internal_cursorObjectContextStorageHash(    rb_cursor_context_storage_hash,
                                                                                                                    rb_this_frame_object);

    if ( RHASH_SIZE( rb_cursor_object_context_hash ) )  {

        rb_cursor_return    =   rb_hash_aref(   rb_cursor_object_context_hash,
                                                rb_this_frame_method );

    }

    return rb_cursor_return;
}

Ruby internals don't seem to have many examples of rb_block_call with argc/argv... At most one or two, and I believe they all simply relay the values internally rather than using them.

Thoughts?

+1  A: 

I am pretty new to Ruby C extensions, but I think I see where your confusion is.

VALUE rb_block_call(VALUE recv, ID mid, int argc, VALUE argv[],
    VALUE (*func) (ANYARGS), VALUE data2)

Here, argc/argv are the arguments passed to the Ruby method you call.

In the C-function called as a block:

VALUE block_function(VALUE rb_yield_value, VALUE data2, int argc, VALUE argv[])

argc/argv are the arguments of the block.
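A minimal pure-Ruby sketch of that distinction may help. It assumes a hypothetical method, each_frame_demo, standing in for something like each_backtrace_frame (the name and the frame hashes are made up for illustration). The arguments given to the method and the values the block receives are independent of each other:

```ruby
# Hypothetical stand-in for a method like each_backtrace_frame:
# it accepts any arguments, but yields exactly one value per frame.
def each_frame_demo(*method_args)
  frames = [{ object: :a, method: :foo }, { object: :b, method: :bar }]
  frames.each { |frame| yield frame }
  method_args # returned only so the caller can inspect what the method saw
end

block_arg_counts = []

# :context and 42 play the role of rb_block_call's argc/argv:
# they go to the method, not to the block.
returned = each_frame_demo(:context, 42) do |*block_args|
  # block_args plays the role of the block function's argc/argv.
  block_arg_counts << block_args.size
end

p block_arg_counts # => [1, 1]
p returned         # => [:context, 42]
```

However many arguments the method is given, the block here only ever sees the single yielded frame.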

A simple example is inject.

Here is the C translation of: [1,2,3].inject(0) { |sum, e| sum + e }

#include "ruby.h"

static VALUE rb_puts(VALUE obj) {
  return rb_funcall(rb_mKernel, rb_intern("puts"), 1, obj);
}

static VALUE inject_block(VALUE yield_value, VALUE data2, int argc, VALUE argv[]) {
  printf("\nyield_value:\n");
  rb_puts(yield_value);
  printf("data2:\n");
  rb_puts(data2);
  printf("argc: %d\n", argc);
  printf("argv:\n");
  int i;
  for(i = 0; i < argc; ++i) {
    printf("argv %d:\n", i);
    rb_puts(argv[i]);
  }

  VALUE sum = argv[0]; // same as yield_value
  VALUE e = argv[1];
  return INT2FIX(FIX2INT(sum) + FIX2INT(e));
}

// registered with arity 0 below, so the only parameter is the receiver
static VALUE rb_block_call_test(VALUE self) {
  VALUE ary = rb_ary_new();
  int i;
  for(i = 0; i < 3; ++i) {
    rb_ary_push(ary, INT2FIX(i+1));
  }
  VALUE block_argv[1];
  block_argv[0] = INT2FIX(0);
  ary = rb_block_call(ary,
                rb_intern("inject"),
                1, // argc
                block_argv, //argv is a C-array of VALUE
                inject_block,
                Qtrue // data2
                );
  return ary;
}

void Init_rb_block_call(void) {
  rb_define_global_function("rb_block_call_test", rb_block_call_test, 0);
}

A call to rb_block_call_test outputs:

yield_value: 0 # sum = argv[0]
data2: true
argc: 2
argv:
argv 0: 0 # sum
argv 1: 1 # e

yield_value: 1
data2: true
argc: 2
argv:
argv 0: 1
argv 1: 2

yield_value: 3
data2: true
argc: 2
argv:
argv 0: 3
argv 1: 3

# => 6
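For comparison, the same trace can be reproduced in plain Ruby. This is only a sketch: since the C code passes 0 as the single method argument, the closest Ruby equivalent is probably inject(0), and the splat block captures what would arrive in the block function's argc/argv.

```ruby
seen = []

# The 0 is the argument to inject (rb_block_call's argc/argv);
# block_args holds what inject yields (the block's argc/argv).
total = [1, 2, 3].inject(0) do |*block_args|
  seen << block_args
  block_args[0] + block_args[1]
end

p seen  # => [[0, 1], [1, 2], [3, 3]]
p total # => 6
```

Each entry in seen has two elements, matching argc: 2 in the C output, and the first element of each pair matches yield_value.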

I believe yield_value is always argv[0].

If you want to pass information between the block and the caller, use data2.

In your example, I suppose #each_backtrace_frame yields one "backtrace_frame" per iteration, which is why the block's argc/argv is always 1/the_backtrace_frame. I believe #each_backtrace_frame accepts any number of arguments, since it did not raise an error when you passed some.

eregon
OK, this actually makes sense upon review. This is decidedly a documentation bug, as the documentation says that argc/argv will be passed to func (which is a parameter) and not to the Ruby method specified by mid. The documentation should read: Calls a method on the recv, with the method name specified by the symbol mid, with argc arguments in argv, supplying func as the block. When func is called as the block, it will receive the value from yield as the first argument, and data2 as the second.
Asher
... and argc/argv as the third/fourth arguments, being the values yielded to the block
eregon
That argc/argv is internal and does not make sense to reference when we are addressing the API. The documentation for the API is incorrect in its wording. The argc/argv referenced by the documentation refers to rb_block_call, not to the internal rb_yield. The documentation is confusing at best, but in my opinion quite clearly and simply wrong. It should be updated.
Asher