If you are doing embedded programming, where exactly is the encryption concern?
If the data never leaves a single processor, for example, encrypting it gains you nothing.
If the concern is traffic between processors on the device, encryption can be difficult, since a DSP won't necessarily have the spare space for an encryption algorithm of any complexity.
If you want speed, a symmetric algorithm is your best bet, and there are many good ones to choose from, such as Blowfish or IDEA. Give each device its own unique symmetric key and store it on the device. That way, if someone takes a device apart, only that one device's key is discovered.
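As a rough sketch of the "one key per device" idea, the keys could be generated at provisioning time and recorded against each serial number. The 16-byte key size, the serial-number format, and the function name `provision_keys` are all assumptions made for illustration; use whatever key size your chosen cipher needs.

```python
import secrets

KEY_BYTES = 16  # assumed 128-bit key; adjust to the cipher you actually use


def provision_keys(serial_numbers):
    """Return a {serial: key} table with an independent random key per device."""
    # Each key comes from the OS random source and is NOT derived from the
    # serial number, so knowing a serial (or another device's key) reveals nothing.
    return {serial: secrets.token_bytes(KEY_BYTES) for serial in serial_numbers}


# Hypothetical serial numbers, for illustration only.
key_table = provision_keys(["SN-0001", "SN-0002", "SN-0003"])
```

The same table is what the server side would keep, while each device is flashed with only its own entry.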
Just tie each key to the device's serial number: when the device communicates with a server, it sends its serial number (unencrypted) along with the packet, and the server can look up the correct symmetric key and decrypt the packet quickly.
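Here is a minimal sketch of that serial-number lookup. To keep it self-contained it uses XTEA in counter mode rather than Blowfish or IDEA, purely because XTEA fits in a few lines; that substitution is mine. The packet layout (4-byte serial, 4-byte nonce, then ciphertext), the function names, and the sample keys are all assumptions for illustration. There is no integrity check here, and the nonce must never repeat for a given key.

```python
import struct

MASK = 0xFFFFFFFF      # keep arithmetic within 32 bits
DELTA = 0x9E3779B9     # XTEA round constant
ROUNDS = 32


def xtea_encrypt_block(block, key):
    """Encrypt one 8-byte block with a 16-byte key using XTEA."""
    v0, v1 = struct.unpack(">2I", block)
    k = struct.unpack(">4I", key)
    total = 0
    for _ in range(ROUNDS):
        v0 = (v0 + ((((v1 << 4) ^ (v1 >> 5)) + v1) ^ (total + k[total & 3]))) & MASK
        total = (total + DELTA) & MASK
        v1 = (v1 + ((((v0 << 4) ^ (v0 >> 5)) + v0) ^ (total + k[(total >> 11) & 3]))) & MASK
    return struct.pack(">2I", v0, v1)


def xtea_ctr(key, nonce, data):
    """Counter mode: XOR data with a keystream; the same call encrypts and decrypts."""
    out = bytearray()
    for offset in range(0, len(data), 8):
        counter_block = struct.pack(">2I", nonce, offset // 8)
        keystream = xtea_encrypt_block(counter_block, key)
        out.extend(b ^ s for b, s in zip(data[offset:offset + 8], keystream))
    return bytes(out)


def device_build_packet(serial, key, nonce, payload):
    """Device side: serial and nonce go in the clear, payload is encrypted."""
    return struct.pack(">2I", serial, nonce) + xtea_ctr(key, nonce, payload)


def server_open_packet(packet, key_table):
    """Server side: read the serial, look up that device's key, decrypt."""
    serial, nonce = struct.unpack(">2I", packet[:8])
    key = key_table[serial]  # raises KeyError for an unknown device
    return serial, xtea_ctr(key, nonce, packet[8:])


# Hypothetical key table; real keys should be random, one per device.
key_table = {1001: bytes(range(16)), 1002: bytes(range(16, 32))}
packet = device_build_packet(1001, key_table[1001], nonce=1, payload=b"temp=21.5C")
serial, plaintext = server_open_packet(packet, key_table)
```

The point is only the flow: the server never has to try keys, it reads the serial number from the header and goes straight to the right one.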
If you want fast hardware encryption: for my MSEE thesis I developed an encryption family that uses arbitrary-order calculus. The key would be difficult to determine because it can be built into the hardware circuit using microstrip. Since there is no processing involved, the circuit would sit directly before the antenna, and everything going over it would be encrypted.