I would love to write a book or books on the subject; too bad I have not yet...
I vote for the just-do-it approach. The Arduino, ARMmite Pro, Olimex, and similar boards can be had for less than 50 dollars. QEMU, and one I wrote, thumbulator, can be had for free; these are instruction/system simulators, and there are MANY other instruction set simulators out there as well. The compilers are free, and there are gobs of websites, including Stack Overflow, with gobs of how-to-get-started information.
Things like debugging tricks have a lot to do with the type of code you write: what style, what level (true embedded, embedded Linux, device driver, application, scripty Python-like high-level code, etc.). It is not possible to explain everything you might encounter and then explain the debug tricks used for each of those situations. Listening to people tell "war stories" about their adventures is as good as any method for learning different ways to think about the problem, as well as how to tell your own war stories to share with others.
My best piece of advice, which many disagree with, is what I call "drive down the middle of the road". Avoid fun compiler or language gee-whiz features. Write code that is extremely portable, even if you know it will never be ported. Do this as a habit and your code will stand a better chance of being compiled correctly even by marginal compilers (if you go into the embedded world you are going to have to deal with marginal compilers; of course gcc is marginal too, and it is out there in the open). Or think of it this way: the number one rule for a clean house or work environment is "don't make the mess in the first place". For example, when cooking you hopefully don't carry the saucepan and a bowl into the living room and pour the contents into the bowl while standing on the carpet, nor do you do it over the stove or a counter. If you do it over the sink you have fewer messes to clean up. Don't make the mess in the first place. The same goes for programming: don't fiddle with or near the trap door, because sometimes you will fall in. Avoid bitfields at all costs, for example, and never use structs across compile domains (code built by a different compiler, with different options, or for a different target). The biggest debugging trick is to not make avoidable bugs, let's put it that way.
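To make the bitfield and struct advice concrete, here is a rough sketch of the middle-of-the-road alternative: plain masks and shifts instead of bitfields, and explicit byte-by-byte packing instead of handing a struct to code compiled somewhere else. The control word layout and every name below are made up for illustration; they do not come from any real part or library.

```c
#include <stdint.h>

/* Hypothetical 32-bit control word, invented for this example:
   bits 3..0 = mode, bit 4 = enable, bits 15..8 = divisor. */
#define CTRL_MODE_MASK     0xFu
#define CTRL_ENABLE_SHIFT  4
#define CTRL_DIVISOR_SHIFT 8
#define CTRL_DIVISOR_MASK  0xFFu

/* Build the word with masks and shifts, so the bit positions are spelled
   out in the source instead of being left to the compiler. */
uint32_t ctrl_pack(uint32_t mode, uint32_t enable, uint32_t divisor)
{
    uint32_t word;

    word  = mode & CTRL_MODE_MASK;
    word |= (enable & 1u) << CTRL_ENABLE_SHIFT;
    word |= (divisor & CTRL_DIVISOR_MASK) << CTRL_DIVISOR_SHIFT;
    return word;
}

/* Pull one field back out of the word. */
uint32_t ctrl_divisor(uint32_t word)
{
    return (word >> CTRL_DIVISOR_SHIFT) & CTRL_DIVISOR_MASK;
}

/* Instead of sharing a struct with separately compiled code, write each
   value into a byte buffer in a fixed (here little-endian) order. */
void put_u32_le(uint8_t *buf, uint32_t value)
{
    buf[0] = (uint8_t)(value & 0xFFu);
    buf[1] = (uint8_t)((value >> 8) & 0xFFu);
    buf[2] = (uint8_t)((value >> 16) & 0xFFu);
    buf[3] = (uint8_t)((value >> 24) & 0xFFu);
}

/* Read the value back from the same fixed byte order. */
uint32_t get_u32_le(const uint8_t *buf)
{
    uint32_t value;

    value  = (uint32_t)buf[0];
    value |= (uint32_t)buf[1] << 8;
    value |= (uint32_t)buf[2] << 16;
    value |= (uint32_t)buf[3] << 24;
    return value;
}
```

It is more typing than a bitfield or a memcpy of a struct, but the layout no longer depends on how any particular compiler chooses to order bits, pad members, or store bytes.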
The second most useful advice I can give is to start compiling stuff for different architectures (llvm is easier to use than gcc for this, but both can do it) and disassemble the output. You don't have to write complete programs; single functions will do. This is where you learn resource-limited programming, architectural differences, etc. Depending on how low level you need to go in this venture, you may have to debug at the instruction-by-instruction level. You may think you know what you told the compiler to do, but it did something else that you won't find until you look at the code in this way. You may be shocked by the resource consumption of globals vs locals when you compare it to what you were taught about globals and locals. Here again, drive down the middle of the road: if your programs stick to add, AND, OR, and XOR operations, if-equal, if-less-than, and so on, avoid division at all costs, and limit the use of multiplication, then your program will be architecturally independent automatically, because every processor does those things (add, AND, OR, XOR, subtract, ones' complement inversion, bit shifts). No book is going to explain this; you have to touch it and feel it yourself by taking the same lines of C code, compiling them in different ways for different target architectures, and inspecting the disassembly.
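Something like the following is enough to get started with that exercise. It is only a sketch with made-up function names: a pair of functions that accumulate into a global vs a local, and a pair that divide vs shift. There is deliberately no main; compile it to an object file and read the disassembly.

```c
#include <stdint.h>

/* A global lives in memory, so the compiler may have to load and store it
   where a local could simply stay in a register. */
uint32_t global_total;

/* Sum the bytes into the global. */
void sum_into_global(const uint8_t *data, uint32_t len)
{
    uint32_t i;

    global_total = 0;
    for (i = 0; i < len; i++) {
        global_total += data[i];
    }
}

/* Same loop, but accumulating in a local and returning it. */
uint32_t sum_into_local(const uint8_t *data, uint32_t len)
{
    uint32_t total = 0;
    uint32_t i;

    for (i = 0; i < len; i++) {
        total += data[i];
    }
    return total;
}

/* Division by a run-time value: on a core without a hardware divider this
   typically turns into a call to a library routine. */
uint32_t average_div(uint32_t sum, uint32_t count)
{
    return sum / count;
}

/* If the count is known to be a power of two, a shift does the same job
   with an instruction every processor has. */
uint32_t average_shift(uint32_t sum, uint32_t log2_count)
{
    return sum >> log2_count;
}
```

Assuming the file is saved as test.c and a GNU ARM cross toolchain is installed, `arm-none-eabi-gcc -O2 -mthumb -mcpu=cortex-m0 -c test.c -o test.o` followed by `arm-none-eabi-objdump -d test.o` shows the Thumb code; `gcc -O2 -c test.c -o test.o` and `objdump -d test.o` show what your host machine gets for the same source. Try it again with -O0 and with other targets and compare.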
Third bit of advice: avoid using debuggers (single step, breakpoints, that kind of debugger) at all costs. They hide more bugs than they find and do not improve the overall product.
An author by the name of Michael Abrash probably had the best influence on me for how to think in the ways you are looking to think. It started with a book called Zen of Assembly Language, which I have in print but which was hard to get even then; it was reprinted as an appendix to the big black book of graphics programming. I think you can find a lot of his work online for free. Zen of Assembly Language, for example, works through the 8088 and 8086, which are now antiques, and the specific problems being solved are no longer problems, but it is the thought process that is important, not the solutions to specific problems.