First and foremost, apologies for any cross-posting. Hope I'm not repeating an issue here, but I was unable to find this elsewhere (via Google and Stack Overflow).

Here's the gist of the error. If I call printf, sprintf or fprintf anywhere within my code, to display a float, I get a SIGSEGV (EXC_BAD_ACCESS) error. Let me give an example.

The following throws the error:

float f = 0.5f;
printf("%f\n",f);

This code does not:

float f = 0.5f;
printf("%d\n",f);

I realize there's an implicit conversion there, but I'm not concerned with that. I just can't fathom why printing a float vs. printing an integer would throw an error.

Note: Part of the code uses malloc to create some very large multidimensional arrays. However, these arrays are not referenced in any way by these print statements. Here's an example of how I'm declaring these arrays.

#define X_LEN 20
#define XDOT_LEN 20
#define THETA_LEN 20
#define THETADOT_LEN 20
#define NUM_STATES (X_LEN+1) * (XDOT_LEN+1) * (THETA_LEN+1) * (THETADOT_LEN+1)
#define NUM_ACTS 100

float *states = (float *)malloc(NUM_STATES * sizeof(float));
// as opposed to float states[NUM_STATES] (more memory efficient)


float **q = (float**)malloc(NUM_STATES * sizeof(float*));

for(int i=0; i < NUM_STATES; i++) {
    float *a = (float*)malloc(NUM_ACTS * sizeof(float));
    for(int j=0; j < NUM_ACTS; j++) {
        a[j] = 0.0f;
    }
    q[i] = a;
}

And then the above printf statements occur later in the code.

The reason I included the malloc stuff is because from what I understand, SIGSEGV is related to poorly formed malloc calls. So, if the array initializations are what's causing the problem, I would like to know:

  • why?
  • how can I change the malloc code to solve this problem?

I've included the crash log generated by OS X, just in case that helps anybody out.

Process:         pole [5453]
Path:            {REDACTED}
Identifier:      pole
Version:         ??? (???)
Code Type:       X86-64 (Native)
Parent Process:  bash [5441]

Date/Time:       2009-12-08 11:38:38.358 -0600
OS Version:      Mac OS X 10.6.2 (10C540)
Report Version:  6

Interval Since Last Report:          130074 sec
Crashes Since Last Report:           68
Per-App Crashes Since Last Report:   63
Anonymous UUID:                      CA20CF15-8C46-4C85-A793-6C69F9F40140

Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x0000000100074f3b
Crashed Thread:  0  Dispatch queue: com.apple.main-thread

Thread 0 Crashed:  Dispatch queue: com.apple.main-thread
0   libSystem.B.dylib               0x00007fff828d489e __Balloc_D2A + 164
1   libSystem.B.dylib               0x00007fff828d49b8 __d2b_D2A + 45
2   libSystem.B.dylib               0x00007fff828e8c74 __dtoa + 320
3   libSystem.B.dylib               0x00007fff828aa960 __vfprintf + 4980
4   libSystem.B.dylib               0x00007fff828ec7db vfprintf_l + 111
5   libSystem.B.dylib               0x00007fff828ec75e fprintf + 196
6   pole                            0x00000001000028b5 Balance::sarsa() + 187
7   pole                            0x0000000100002e54 main + 49
8   pole                            0x00000001000010a8 start + 52

Thread 0 crashed with X86 Thread State (64-bit):
  rax: 0x0000000000000001  rbx: 0x000000010042cca0  rcx: 0x000000010042cca8  rdx: 0x0000000100074f3b
  rdi: 0x000000000000000e  rsi: 0x00007fff5fbfecbc  rbp: 0x00007fff5fbfeba0  rsp: 0x00007fff5fbfeb90
   r8: 0x00007fff5fbff0b0   r9: 0x0000000000000000  r10: 0x00000000ffffffff  r11: 0x000000010083a40b
  r12: 0x0000000000000001  r13: 0x00007fff5fbfecb8  r14: 0x00007fff5fbfecbc  r15: 0x000000010000363e
  rip: 0x00007fff828d489e  rfl: 0x0000000000010202  cr2: 0x0000000100074f3b

Binary Images:
       0x100000000 -        0x100003fff +pole ??? (???)  {REDACTED}
    0x7fff5fc00000 -     0x7fff5fc3bdef  dyld 132.1 (???)  /usr/lib/dyld
    0x7fff81697000 -     0x7fff8169bff7  libmathCommon.A.dylib ??? (???)  /usr/lib/system/libmathCommon.A.dylib
    0x7fff8289c000 -     0x7fff82a5aff7  libSystem.B.dylib ??? (???)  /usr/lib/libSystem.B.dylib
    0x7fff83c4c000 -     0x7fff83cc9fef  libstdc++.6.dylib ??? (???)  /usr/lib/libstdc++.6.dylib
    0x7fffffe00000 -     0x7fffffe01fff  libSystem.B.dylib ??? (???)  /usr/lib/libSystem.B.dylib

Model: MacBookPro4,1, BootROM MBP41.00C1.B03, 2 processors, Intel Core 2 Duo, 2.4 GHz, 2 GB, SMC 1.27f2
Graphics: NVIDIA GeForce 8600M GT, GeForce 8600M GT, PCIe, 256 MB
Memory Module: global_name
AirPort: spairport_wireless_card_type_airport_extreme (0x14E4, 0x8C), Broadcom BCM43xx 1.0 (5.10.91.19)
Bluetooth: Version 2.2.4f3, 2 service, 1 devices, 1 incoming serial ports
Network Service: AirPort, AirPort, en1
Serial ATA Device: Hitachi HTS542520K9SA00, 186.31 GB
Parallel ATA Device: MATSHITADVD-R   UJ-867
USB Device: Built-in iSight, 0x05ac  (Apple Inc.), 0x8502, 0xfd400000
USB Device: Apple Internal Keyboard / Trackpad, 0x05ac  (Apple Inc.), 0x0230, 0x5d200000
USB Device: IR Receiver, 0x05ac  (Apple Inc.), 0x8242, 0x5d100000
USB Device: BRCM2046 Hub, 0x0a5c  (Broadcom Corp.), 0x4500, 0x1a100000
USB Device: Bluetooth USB Host Controller, 0x05ac  (Apple Inc.), 0x820f, 0x1a110000

Thanks.

+1  A: 

It is possible that you have a pointer arithmetic error or a buffer overflow which has the side effect of breaking the printf.

Try commenting out the majority of the code (except the printf) and see if it crashes. If it doesn't, then uncomment parts little by little until you get the crash back. Then you'll know where the problem is.

Also, if you are using Linux or any Unix variant, look into using valgrind as well.

EDIT:

I see this in your error report:

0   libSystem.B.dylib               0x00007fff828d489e __Balloc_D2A + 164

That's where the actual crash is, which appears to be a low level allocation routine. I would guess that you have a buffer overflow which is corrupting the free list, making certain future allocations break (such as in this printf).

Evan Teran
A: 

Possibility number one is that you are simply malloc()'ing so much memory for your arrays that when printf() attempts to malloc() a little more, it fails. I think this is highly unlikely.

Possibility number two is that your printf() is not as simple as you are showing, but rather is some rather complicated multilevel pointer expression, and that expression is going wild somewhere.
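To put a rough number on possibility one, here is a minimal sketch that adds up the payload bytes requested by the three allocations in the question. The sizes are copied from the question; the outer parentheses on NUM_STATES, the function total_payload_bytes and the helper alloc_floats are additions made for illustration only. The total ignores allocator overhead, which the question's 194481 separate row allocations would add to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Sizes copied from the question. The outer parentheses are added here. */
#define X_LEN 20
#define XDOT_LEN 20
#define THETA_LEN 20
#define THETADOT_LEN 20
#define NUM_STATES ((X_LEN + 1) * (XDOT_LEN + 1) * (THETA_LEN + 1) * (THETADOT_LEN + 1))
#define NUM_ACTS 100

/* Payload bytes requested by states, q, and every row of q.
   Allocator bookkeeping overhead is not included. */
static size_t total_payload_bytes(void)
{
    size_t states = (size_t)NUM_STATES * sizeof(float);
    size_t rows   = (size_t)NUM_STATES * sizeof(float *);
    size_t cells  = (size_t)NUM_STATES * NUM_ACTS * sizeof(float);
    return states + rows + cells;
}

/* Hypothetical helper: allocate count floats, returning NULL if the
   byte count would overflow or if malloc fails. Callers must check. */
static float *alloc_floats(size_t count)
{
    assert(count > 0);
    if (count > SIZE_MAX / sizeof(float)) {
        return NULL; /* count * sizeof(float) would wrap around */
    }
    return malloc(count * sizeof(float));
}
```

NUM_STATES works out to 21^4 = 194481, so with a 4-byte float and 8-byte pointers the total is 80,126,172 bytes, roughly 80 MB. That is well within reach of a 2 GB machine, which supports the "highly unlikely" verdict. Checking every malloc() result for NULL, as alloc_floats leaves to its caller, would rule the possibility out directly.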

John R. Strohm
+1  A: 

SIGSEGV occurs either when you access a virtual address that is not mapped to anything or when you access an address in a way which is not allowed (e.g., trying to write to a read-only area). As you say, segmentation faults can be related to heap corruption. This is because internally, most malloc implementations interleave bookkeeping information with allocated data on the heap. If that bookkeeping information gets corrupted, malloc's behavior is undefined. You may not see any errors until much later in the program.

In this case, printf is probably allocating some memory internally, which is triggering the fault. Probably the best way to fix this is to run your program with valgrind, which will notify you of any heap corruption as soon as it happens.
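Where a tool like valgrind is not at hand, a crude substitute is to put a guard region after each allocation and check it later. The sketch below is not how any real malloc or valgrind works internally. The names guarded_malloc, guard_intact, GUARD_BYTES and GUARD_FILL are made up for this example, and it only catches writes that land within the first GUARD_BYTES past the end of the block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_BYTES 16   /* size of the guard region after the payload */
#define GUARD_FILL  0xA5 /* byte pattern the guard region is filled with */

/* Allocate n usable bytes followed by GUARD_BYTES of GUARD_FILL.
   Returns NULL on failure. Release with free(). */
static void *guarded_malloc(size_t n)
{
    assert(n <= SIZE_MAX - GUARD_BYTES);
    unsigned char *p = malloc(n + GUARD_BYTES);
    if (p == NULL) {
        return NULL;
    }
    memset(p + n, GUARD_FILL, GUARD_BYTES);
    return p;
}

/* Return 1 if the guard after an n-byte block is untouched, 0 if
   something wrote past the end of the payload. */
static int guard_intact(const void *block, size_t n)
{
    const unsigned char *guard = (const unsigned char *)block + n;
    for (size_t i = 0; i < GUARD_BYTES; i++) {
        if (guard[i] != GUARD_FILL) {
            return 0;
        }
    }
    return 1;
}
```

Calling guard_intact on each array at a few checkpoints (for example, just before the printf that crashes) narrows down which stretch of code wrote past the end of a block. A real memory checker reports the offending write itself, so prefer one when available.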

Jay Conrod
A: 

Some implementations of printf call malloc when processing "%f". If yours does, and you have overflowed memory at some point (i.e. written past the end of an allocation), then printf can try to make an allocation, find the heap corrupted, and throw an error ...

just a thought.

Edit: It's probably worth looking at how your states array is filled ... the other two allocations appear fine, but you could be writing past the end anywhere ...
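One way to check how the states array is filled is to route every access through a single index function that asserts its bounds. The question does not show how a state is mapped to an index, so the row-major layout below, the assumption that each component runs from 0 to its LEN inclusive, and the name state_index are all guesses made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Sizes copied from the question. The outer parentheses are added here. */
#define X_LEN 20
#define XDOT_LEN 20
#define THETA_LEN 20
#define THETADOT_LEN 20
#define NUM_STATES ((X_LEN + 1) * (XDOT_LEN + 1) * (THETA_LEN + 1) * (THETADOT_LEN + 1))

/* Hypothetical mapping from four discretized components to a flat
   index, row-major, with every component checked against its range. */
static size_t state_index(int x, int xdot, int theta, int thetadot)
{
    assert(x >= 0 && x <= X_LEN);
    assert(xdot >= 0 && xdot <= XDOT_LEN);
    assert(theta >= 0 && theta <= THETA_LEN);
    assert(thetadot >= 0 && thetadot <= THETADOT_LEN);

    size_t idx = (size_t)x;
    idx = idx * (XDOT_LEN + 1) + (size_t)xdot;
    idx = idx * (THETA_LEN + 1) + (size_t)theta;
    idx = idx * (THETADOT_LEN + 1) + (size_t)thetadot;

    assert(idx < (size_t)NUM_STATES);
    return idx;
}
```

With this in place, an out-of-range component aborts at the faulty call, where the cause is visible, instead of silently corrupting the heap and crashing later inside printf.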

Goz
A: 

printf("%f", parm) expects the parameter to be a double. Your f is a float which gets implicitly converted to a double.

Maybe the implicit conversion is getting messed up?

Try an explicit conversion

float f = 0.5f;
printf("%f\n",(double)f);

or even

float f = 0.5f;
double ff = f;
printf("%f\n",ff);
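For reference, the promotion itself can be observed with a small variadic function. In this sketch, first_double and prints_same are made-up names. A float passed through "..." undergoes the default argument promotions and arrives as a double, so reading it back with va_arg(ap, double) is correct whether or not the caller wrote the cast.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Return the first variadic argument, read as a double. A float
   argument is promoted to double before the call is made. */
static double first_double(int count, ...)
{
    assert(count >= 1);
    va_list ap;
    va_start(ap, count);
    double d = va_arg(ap, double);
    va_end(ap);
    return d;
}

/* Return 1 if formatting f with and without an explicit cast gives
   the same text, 0 otherwise. */
static int prints_same(float f)
{
    char implicit[32];
    char explicit_cast[32];
    snprintf(implicit, sizeof implicit, "%f", f);
    snprintf(explicit_cast, sizeof explicit_cast, "%f", (double)f);
    return strcmp(implicit, explicit_cast) == 0;
}
```

On a conforming compiler both forms behave identically, so if the explicit cast changes the behaviour of the crashing program, that points to memory corruption elsewhere and not to the conversion.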
pmg
+3  A: 

You have a bug elsewhere in your code that is not related to the printf statement. You're stomping on memory somewhere, but the problem doesn't manifest itself until printf tries to allocate some memory with __Balloc_D2A, which crashes because the heap data structures it uses to keep track of free memory blocks have been corrupted.

To try to detect where you're stomping on memory, there are a number of tools available. If you were on Linux, I would suggest using valgrind, which essentially runs your code in a virtual machine and tells you whenever you do anything illegal like read/write memory out of bounds, read an uninitialized variable, etc. However, it's not available in Mac OS X (yet).

One option is to use libgmalloc:

% cat gmalloctest.c
#include <stdlib.h>
#include <stdio.h>

int main(void)
{
  unsigned *buffer = (unsigned *)malloc(sizeof(unsigned) * 100);
  unsigned i;

  for (i = 0; i < 200; i++) {
    buffer[i] = i;
  }

  for (i = 0; i < 200; i++) {
     printf ("%u  ", buffer[i]);
  }
}

% cc -g -o gmalloctest gmalloctest.c
% gdb gmalloctest
Reading symbols for shared libraries .. done
(gdb) set env DYLD_INSERT_LIBRARIES /usr/lib/libgmalloc.dylib
(gdb) r
Starting program: gmalloctest
Reading symbols for shared libraries .. done
GuardMalloc: Allocations will be placed on 16 byte boundaries.
GuardMalloc:  - Some buffer overruns may not be noticed.
GuardMalloc:  - Applications using vector instructions (e.g., SSE or Altivec) should work.
GuardMalloc: GuardMalloc version 19

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0xb000d000
0x00001f65 in main () at gmalloctest.c:10
10          buffer[i] = i;
(gdb) print i
$1 = 100
(gdb) where
#0  0x00001f65 in main () at gmalloctest.c:10
(gdb)

See also Enabling the Malloc Debugging Features.

Adam Rosenfield
Actually, as of Valgrind 3.5.0, Mac OS X (10.5.x) *is* supported. BTW, Valgrind *rocks*.
Dan Moulding
Ah, good to know. Maybe I should finally upgrade away from 10.4. And yes, Valgrind does indeed rock hard.
Adam Rosenfield
A: 

Just a guess, but have you done #include <stdlib.h>? Without a prototype for malloc() in scope, the compiler assumes that malloc() returns int, which is obviously not true.

Whether or not I am right, this illustrates a reason for not casting the return value of malloc() in C. Note that if you are writing code for both C and C++, you will need the cast, but for pure C, don't cast the return value of malloc(), and let the compiler do the right thing for you.

So, instead of:

T *data = (T *) malloc(sz * sizeof(T));

do this instead:

#include <stdlib.h>
...
T *data = malloc(sz * sizeof *data);

Here, T is any type. The advantages are:

  1. The compiler will complain if you forgot to #include <stdlib.h>,
  2. If you change the type of data, the malloc() call doesn't need to change, and
  3. I think it's easier to read and less error-prone.

By casting the return value of malloc(), you don't give the compiler a chance to warn you about missing inclusion of stdlib.h.
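Applied to the q table from the question, the same idiom can also replace the per-row malloc() calls with a single allocation. This is only a sketch in pure C: q_row and alloc_q_table are made-up names, NUM_ACTS is copied from the question, and relying on calloc() to produce 0.0f assumes that all-bits-zero represents 0.0f, which holds for IEEE 754 floats.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NUM_ACTS 100 /* copied from the question */

/* One row of the table: NUM_ACTS floats stored contiguously. */
typedef float q_row[NUM_ACTS];

/* Hypothetical helper: allocate num_states zero-filled rows in one
   block. Returns NULL on failure. Release with a single free(). */
static q_row *alloc_q_table(size_t num_states)
{
    assert(num_states > 0);
    q_row *q = calloc(num_states, sizeof *q);
    return q;
}
```

The result is still indexed as q[i][j], but there is one allocation to check and one to free, and sizeof *q stays correct if the row type changes.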

Alok