Hi all,

I am developing a RAID system and I am getting the following BUG when I issue lots of parallel IO. I am not able to figure out how to go about debugging this dump. It is evident that the RAID_LAZY_WRITE thread is trying to schedule while it is in atomic context on the CPU, but I cannot trace exactly how and why that happens, since I do not acquire any lock when issuing IO to the block layer.

I enabled spinlock debugging and lock dependency checking so that I could get detailed debug info. Here is the whole dump I got; please give me tips and ideas on how to analyze and debug it.

BUG: scheduling while atomic: RAID_LAZY_WRITE/102/0x00000205
no locks held by RAID_LAZY_WRITE/102.

Showing all locks held in the system:

=============================================

Pid: 102, comm:      RAID_LAZY_WRITE
CPU: 0    Not tainted  (2.6.24 #274)
PC is at add_preempt_count+0xa0/0xbc
LR is at raid_write_stripe+0x388/0x544
pc : [<c004b014>]    lr : [<c015a83c>]    psr: 80000013
sp : cfc63ec4  ip : cfc63ed4  fp : cfc63ed0
r10: cf5ce118  r9 : 00000001  r8 : 000000b4
r7 : 00000001  r6 : cf403cd8  r5 : cfc62000  r4 : 00000002
r3 : 00000001  r2 : 00000001  r1 : cfc62000  r0 : 00000001
Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
Control: 0000397f  Table: 0f558000  DAC: 00000017
[<c0039fa0>] (show_regs+0x0/0x4c) from [<c004ae14>] (__schedule_bug+0x54/0x68)
 r4:cfc9b0a0
[<c004adc0>] (__schedule_bug+0x0/0x68) from [<c029747c>] (schedule+0x74/0x2f4)
 r5:cfc9b0a0 r4:c034d13c
[<c0297408>] (schedule+0x0/0x2f4) from [<c029772c>] (io_schedule+0x30/0x54)
[<c02976fc>] (io_schedule+0x0/0x54) from [<c012e214>]
(get_request_wait+0xac/0x140)
 r4:00000000
[<c012e168>] (get_request_wait+0x0/0x140) from [<c012e864>]
(__make_request+0x478/0x544)
[<c012e3ec>] (__make_request+0x0/0x544) from [<c012c3d0>]
(generic_make_request+0x2a8/0x2f0)
[<c012c128>] (generic_make_request+0x0/0x2f0) from [<c0168858>]
(raid_submit_disk_io+0x348/0x37c)
[<c0168510>] (raid_submit_disk_io+0x0/0x37c) from [<c016be24>]
(perform_raid5_ops+0xdc8/0x1100)
[<c016b05c>] (perform_raid5_ops+0x0/0x1100) from [<c016c594>]
(raid5_RCW_postXOR+0x438/0x4c4)
[<c016c15c>] (raid5_RCW_postXOR+0x0/0x4c4) from [<c016cb8c>]
(raid5_xor_completed+0x2a4/0x3b0)
[<c016c8e8>] (raid5_xor_completed+0x0/0x3b0) from [<c01fb3a4>]
(iop_adma_run_tx_complete_actions+0x64/0xe8)
[<c01fb340>] (iop_adma_run_tx_complete_actions+0x0/0xe8) from
[<c01fb720>] (__iop_adma_slot_cleanup+0x2f8/0x3c0)
 r8:c0352f24 r7:00000000 r6:00000000 r5:00000000 r4:cf4faa7c
[<c01fb428>] (__iop_adma_slot_cleanup+0x0/0x3c0) from [<c01fb7f8>]
(iop_adma_tasklet+0x10/0x14)
[<c01fb7e8>] (iop_adma_tasklet+0x0/0x14) from [<c0057204>]
(run_timer_softirq+0x17c/0x208)
[<c0057088>] (run_timer_softirq+0x0/0x208) from [<c0053268>]
(__do_softirq+0x64/0xd0)
[<c0053204>] (__do_softirq+0x0/0xd0) from [<c005331c>] (irq_exit+0x48/0x5c)
[<c00532d4>] (irq_exit+0x0/0x5c) from [<c0038048>]
(__exception_text_start+0x48/0x60)
[<c0038000>] (__exception_text_start+0x0/0x60) from [<c0038ab0>]
(__irq_svc+0x50/0x7c)
Exception stack(0xcfc63e7c to 0xcfc63ec4)
3e60:                                                                00000001
3e80: cfc62000 00000001 00000001 00000002 cfc62000 cf403cd8 00000001 000000b4
3ea0: 00000001 cf5ce118 cfc63ed0 cfc63ed4 cfc63ec4 c015a83c c004b014 80000013
3ec0: ffffffff
 r6:00000020 r5:cfc63eb0 r4:ffffffff
[<c004af74>] (add_preempt_count+0x0/0xbc) from [<c015a83c>]
(raid_write_stripe+0x388/0x544)
[<c015a4b4>] (raid_write_stripe+0x0/0x544) from [<c015fc44>]
(raid_lazy_writer+0x4c4/0x6b8)
[<c015f780>] (raid_lazy_writer+0x0/0x6b8) from [<c0159528>]
(raid_thread_daemon+0x140/0x15c)
[<c01593e8>] (raid_thread_daemon+0x0/0x15c) from [<c0062c78>]
(kthread+0x58/0x90)
[<c0062c20>] (kthread+0x0/0x90) from [<c0051284>] (do_exit+0x0/0x764)
 r6:00000000 r5:00000000 r4:00000000


BUG: spinlock cpu recursion on CPU#0, RAID_DAEMON/101
 lock: cf4faae0, .magic: dead4ead, .owner: RAID_LAZY_WRITE/102, .owner_cpu: 0
[<c032a9d0>] (dump_stack+0x0/0x14) from [<c0141c90>] (spin_bug+0x90/0xa4)
[<c0141c00>] (spin_bug+0x0/0xa4) from [<c0141e08>] (_raw_spin_lock+0x68/0x15c)
 r5:c01fb378 r4:cf4faae0
[<c0141da0>] (_raw_spin_lock+0x0/0x15c) from [<c0298ec8>]
(_spin_lock_bh+0x4c/0x54)
[<c0298e7c>] (_spin_lock_bh+0x0/0x54) from [<c01fb378>]
(iop_adma_run_tx_complete_actions+0x38/0xe8)
 r5:00000000 r4:cf4faa7c
[<c01fb340>] (iop_adma_run_tx_complete_actions+0x0/0xe8) from
[<c01fb720>] (__iop_adma_slot_cleanup+0x2f8/0x3c0)
 r8:00000001 r7:00000000 r6:00000000 r5:00000000 r4:cf4faa7c
[<c01fb428>] (__iop_adma_slot_cleanup+0x0/0x3c0) from [<c01fb7f8>]
(iop_adma_tasklet+0x10/0x14)
[<c01fb7e8>] (iop_adma_tasklet+0x0/0x14) from [<c0052e30>]
(tasklet_action+0x8c/0xe4)
[<c0052da4>] (tasklet_action+0x0/0xe4) from [<c0053268>]
(__do_softirq+0x64/0xd0)
 r6:0000000a r5:00000001 r4:c0351f44
[<c0053204>] (__do_softirq+0x0/0xd0) from [<c005331c>] (irq_exit+0x48/0x5c)
[<c00532d4>] (irq_exit+0x0/0x5c) from [<c0038048>]
(__exception_text_start+0x48/0x60)
[<c0038000>] (__exception_text_start+0x0/0x60) from [<c0038ab0>]
(__irq_svc+0x50/0x7c)
Exception stack(0xcfcb1e64 to 0xcfcb1eac)
1e60:          cf652404 00000001 ffffffff 00000001 cf652404 40000013 c0417bec
1e80: 7fffffff cea81d88 00000000 cf6523e0 cfcb1ec0 cfcb1eac cfcb1eac c0298d00
1ea0: c0298d04 60000013 ffffffff
 r6:00000064 r5:cfcb1e98 r4:ffffffff
[<c0298cd4>] (_spin_unlock_irqrestore+0x0/0x54) from [<c015c344>]
(raid_read_stripe+0x204/0x560)
 r5:00000008 r4:cf652404
[<c015c140>] (raid_read_stripe+0x0/0x560) from [<c01602bc>]
(raid_daemon+0x1d4/0x1110)
[<c01600e8>] (raid_daemon+0x0/0x1110) from [<c0159528>]
(raid_thread_daemon+0x140/0x15c)
[<c01593e8>] (raid_thread_daemon+0x0/0x15c) from [<c0062c78>]
(kthread+0x58/0x90)
[<c0062c20>] (kthread+0x0/0x90) from [<c0051284>] (do_exit+0x0/0x764)
 r6:00000000 r5:00000000 r4:00000000

Thank you.

Regards, Vikash Kumar http://VikashKumar.com/

A: 

The issue seems to be the invocation of a block layer API (generic_make_request) from softirq context, namely from the completion callback of another IO or an XOR operation. As the call stack shows, the callbacks are being invoked from timer/tasklet softirq context, which is interrupt context, and we cannot sleep in interrupt context; hence the BUG.

However, this still needs to be validated; I will update this thread as soon as I have done that. Until then, I welcome your comments on it.
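For anyone wanting to check the same thing: one quick way to confirm where the atomic context comes from, before restructuring anything, is to assert the execution context right before the call that may sleep. The sketch below is only illustrative and not code from my driver; raid_assert_may_sleep() is a made-up helper, while might_sleep(), in_interrupt() and irqs_disabled() are standard kernel facilities (the might_sleep() warning needs the sleep-inside-spinlock debug option enabled).

#include <linux/kernel.h>
#include <linux/hardirq.h>
#include <linux/irqflags.h>

/*
 * Sketch only: raid_assert_may_sleep() is a hypothetical helper.
 * Call it right before generic_make_request() in the submission path
 * (raid_submit_disk_io in the trace above).
 */
static inline void raid_assert_may_sleep(void)
{
	/* Fires if we are in hard/soft IRQ context or have IRQs disabled. */
	WARN_ON_ONCE(in_interrupt() || irqs_disabled());

	/*
	 * With sleep-inside-spinlock debugging enabled, might_sleep()
	 * prints a warning plus backtrace whenever the caller is atomic,
	 * which pinpoints the offending call path.
	 */
	might_sleep();
}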

Update: I have validated this. Calling the block IO API (generic_make_request) from softirq context (read: from the completion callback) is a bad idea; it results in the 'scheduling while atomic' BUG. I have now offloaded that task to another thread and it works absolutely fine.
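In my driver the offload uses a dedicated thread; purely as an illustration of the idea, a workqueue-based variant would look roughly like the sketch below. The names raid_deferred_bio, raid_io_work, raid_defer_bio and raid_io_wq are made up for the example; INIT_WORK, queue_work, generic_make_request and create_singlethread_workqueue are the standard 2.6.24 APIs. The point is that the work function runs in process context, where get_request_wait() is allowed to block.

#include <linux/kernel.h>
#include <linux/workqueue.h>
#include <linux/blkdev.h>
#include <linux/bio.h>
#include <linux/slab.h>

/* Illustrative names, not my actual code. */
struct raid_deferred_bio {
	struct work_struct work;
	struct bio *bio;
};

/* Created once at driver init, e.g.:
 *   raid_io_wq = create_singlethread_workqueue("raid_io");
 */
static struct workqueue_struct *raid_io_wq;

/* Runs in process context, so generic_make_request() is free to block
 * in get_request_wait() when the request queue is congested. */
static void raid_io_work(struct work_struct *work)
{
	struct raid_deferred_bio *d =
		container_of(work, struct raid_deferred_bio, work);

	generic_make_request(d->bio);
	kfree(d);
}

/* Called from the XOR/ADMA completion callback (tasklet/softirq context):
 * instead of submitting the bio here, queue it for the worker.  Note the
 * GFP_ATOMIC -- we are still in atomic context at this point. */
static int raid_defer_bio(struct bio *bio)
{
	struct raid_deferred_bio *d = kmalloc(sizeof(*d), GFP_ATOMIC);

	if (!d)
		return -ENOMEM;

	d->bio = bio;
	INIT_WORK(&d->work, raid_io_work);
	queue_work(raid_io_wq, &d->work);
	return 0;
}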

Thank you.

Regards, Vikash

Vikash
This, indeed, is the problem; I have verified it. If I offload the task of issuing IO to a separate thread instead of doing it from the XOR callback, the scheduling bug goes away.
Vikash
A: 

Hi,

My problem is different from yours, but I also got "BUG: scheduling while atomic". I spent two days debugging my code, and I finally found (not 100% sure, but almost) that the problem happens when I call "kmalloc(sizeof(struct mystruct), GFP_KERNEL)" inside a scheduled function. I decided to post these comments because it was not obvious to me that removing the kmalloc would resolve the problem.

Now I have to think of another way to allocate memory inside the scheduled function without triggering the bug again!

I welcome your comments on this too, and any suggestion for an allocation function which doesn't make the kernel suffer!

ahhhhhhh! it's hard to play with the kernel!!!

babanjida
A: 

I resolved my problem by calling "kmalloc(sizeof(struct mystruct), GFP_ATOMIC)" instead of "kmalloc(sizeof(struct mystruct), GFP_KERNEL)". This is a good article about atomicity: http://kerneltrap.org/node/435
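To make the difference concrete, here is a minimal sketch of an allocation inside a timer callback, which runs in atomic (softirq) context; struct mystruct and my_timer_fn are placeholders, not my real code. GFP_KERNEL may sleep waiting for free memory, which is exactly what triggers the BUG in atomic context; GFP_ATOMIC never sleeps but can fail, so the NULL check is mandatory.

#include <linux/timer.h>
#include <linux/slab.h>

/* Placeholder struct, standing in for "struct mystruct". */
struct mystruct {
	int data;
};

/* Timer callbacks run in softirq (atomic) context, so GFP_KERNEL --
 * which may sleep waiting for free pages -- is not allowed here. */
static void my_timer_fn(unsigned long arg)
{
	struct mystruct *m = kmalloc(sizeof(*m), GFP_ATOMIC);

	if (!m)
		return;	/* atomic allocations can fail; always check */

	m->data = (int)arg;
	/* ... use m, or hand it off to process context ... */
	kfree(m);
}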

I'm sorry if my comments are not helping you, Vikash, but I am trying to help because the "scheduling while atomic" bug disturbed me a lot.

Regards, babanjida

babanjida
Thanks for your comments. Yes, kmalloc can sleep while allocating memory if you haven't passed the GFP_ATOMIC flag. My problem was due to making block IO calls from softirq context (read: the completion callbacks), and it is fixed now.
Vikash