wrt floating point:
on a G1, adding two floats takes about 400ns. adding two ints takes about 250ns.
on a nexus one running eclair (pre-JIT), both operations take around 120ns. (ints are slightly faster, but you'd have to be microbenchmarking to notice.) there's a small percentage difference between int and long, and float and double, but basically if you can afford one, you can probably afford the other.
other current devices will fall somewhere between these extremes. (other operations differ too: multiplication is more expensive than addition and subtraction, and division is more expensive still. no current device has hardware integer division, so integer division is done in software.)
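to make the integer-division point concrete, here's a minimal sketch of the kind of strength reduction people reach for in a genuinely hot loop: replacing division by a constant power of two with a shift. the class and method names are made up for illustration; on devices with no hardware divide this saves a software-division call, but measure before bothering.

```java
class DivSketch {
    // plain integer division: compiled to a software division routine
    // on devices without a hardware divide instruction.
    static int halveByDivision(int x) {
        return x / 2;
    }

    // a right shift gives the same result for non-negative x.
    // (note: for negative x, >> rounds toward negative infinity while
    // / rounds toward zero, so this is NOT a drop-in replacement.)
    static int halveByShift(int x) {
        return x >> 1;
    }
}
```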
but don't obsess about any of this until you have a problem. chances are, your performance problems will be down to a poor choice of algorithm or data structure, just like everyone's performance problems always are.
most of the current (eclair) performance documentation is incorrect. benchmark things yourself, on the device(s) you care about.
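in that spirit, here's a crude sketch of the kind of microbenchmark you'd run on the actual device. the names are made up, and a trustworthy microbenchmark needs warmup runs and many repetitions; this just shows the shape.

```java
class AddBench {
    // time a loop of int additions; returns elapsed nanoseconds.
    static long timeIntAdds(int iterations) {
        long start = System.nanoTime();
        int sum = 0;
        for (int i = 0; i < iterations; ++i) {
            sum += i;
        }
        long elapsed = System.nanoTime() - start;
        // use the result so the whole loop can't be optimized away
        if (sum == 42) System.out.println("unlikely");
        return elapsed;
    }

    // same loop, but with float additions.
    static long timeFloatAdds(int iterations) {
        long start = System.nanoTime();
        float sum = 0f;
        for (int i = 0; i < iterations; ++i) {
            sum += i;
        }
        long elapsed = System.nanoTime() - start;
        if (sum == 42f) System.out.println("unlikely");
        return elapsed;
    }
}
```

divide each elapsed time by the iteration count to get a rough per-operation cost on the device you actually care about.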
but if you were really asking "what should a desktop/server java programmer watch out for?", i'd suggest: unnecessary allocation. you don't have a spare core to run your GC on like you do on the desktop/server, and you don't have gigabytes of heap either; your heap will be at most 24MiB on current devices. time spent in GC is time not spent doing real work. so avoid unnecessary allocation in inner loops.
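as a sketch of what "avoid unnecessary allocation in inner loops" looks like in practice: hoist the allocation out of the loop and reuse a scratch object. the `Point` class and method names here are hypothetical.

```java
class Point {
    float x, y;
}

class AllocationSketch {
    // allocates a new Point on every pass: one piece of garbage
    // per iteration for the collector to clean up.
    static float sumXAllocating(float[] xs, float[] ys) {
        float total = 0f;
        for (int i = 0; i < xs.length; ++i) {
            Point p = new Point(); // fresh garbage each time around
            p.x = xs[i];
            p.y = ys[i];
            total += p.x;
        }
        return total;
    }

    // allocates one Point before the loop and reuses it:
    // no per-iteration garbage at all.
    static float sumXReusing(float[] xs, float[] ys) {
        float total = 0f;
        Point p = new Point(); // allocated once, reused every pass
        for (int i = 0; i < xs.length; ++i) {
            p.x = xs[i];
            p.y = ys[i];
            total += p.x;
        }
        return total;
    }
}
```

both versions compute the same result; the second just gives the GC nothing to do while the loop runs.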