This comp.compilers item has a specific recommendation for computing modulo 3.
An alternative, especially if the maximum size of the dividend is modest, is to multiply by the reciprocal of 3 as a fixed-point value, with enough bits of precision to handle the largest dividend, to compute the quotient, and then subtract 3*quotient from the dividend to get the remainder. Both multiplies can be implemented with a fixed sequence of shifts and adds. The number of instructions depends on the bit pattern of the reciprocal, so this works best when the dividend max is modest: a small dividend needs only a short reciprocal, and therefore only a few shifts and adds.
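As a rough sketch of the reciprocal approach (the function names and the choice of 8-bit and 32-bit dividends are my own assumptions, not something from the comp.compilers item), here is what it can look like in C. For an 8-bit dividend the reciprocal 171 = (2^9 + 1) / 3 has enough precision, and its bit pattern 10101011 turns into five shifted terms; for a 32-bit dividend the constant 0xAAAAAAAB = (2^33 + 1) / 3 does the same job with a 64-bit product.

```c
#include <stdint.h>

/* Hypothetical helper: n % 3 for an 8-bit dividend.
   171 == (2^9 + 1) / 3, so (n * 171) >> 9 == n / 3 for all n < 256.
   171 is binary 10101011, i.e. 128 + 32 + 8 + 2 + 1. */
static uint32_t mod3_u8(uint8_t n)
{
    uint32_t x = n;
    uint32_t p = (x << 7) + (x << 5) + (x << 3) + (x << 1) + x; /* x * 171 */
    uint32_t q = p >> 9;                                        /* x / 3   */
    return x - ((q << 1) + q);                                  /* x - 3*q */
}

/* Hypothetical helper: n % 3 for a 32-bit dividend.
   0xAAAAAAAB == (2^33 + 1) / 3, so the 64-bit product shifted right
   by 33 equals n / 3 for every 32-bit n. */
static uint32_t mod3_u32(uint32_t n)
{
    uint32_t q = (uint32_t)(((uint64_t)n * 0xAAAAAAABu) >> 33);
    return n - ((q << 1) + q);                                  /* n - 3*q */
}
```

The 32-bit version leaves the multiply to the hardware; on a machine without a fast multiplier you would expand the constant into shifts and adds the same way the 8-bit version does, at the cost of more instructions because the constant has more set bits.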
Regarding adding the digits of the number: if you want to add the decimal digits, you're going to end up doing what amounts to a conversion to decimal, which involves a divide by 10 somewhere. If you're willing to settle for adding up the digits in base 2, you can do this with an easy shift-right-and-add loop. Various clever tricks can be used to do this in chunks of N bits to speed it up further.
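To make the base-2 case concrete, here is a minimal sketch in C, assuming a 32-bit unsigned input (the function names are hypothetical). The first function is the plain shift-right-and-add loop; the second is one of the well-known chunked variants, which adds neighbouring fields in parallel and doubles the field width at each step.

```c
#include <stdint.h>

/* Sum of the base-2 digits, one bit per iteration. */
static unsigned bit_sum_loop(uint32_t n)
{
    unsigned sum = 0;
    while (n != 0) {
        sum += n & 1u;   /* add the lowest digit */
        n >>= 1;         /* shift the next digit into place */
    }
    return sum;
}

/* Same result in five steps: add adjacent 1-bit fields, then 2-bit,
   4-bit, 8-bit and 16-bit fields, all in parallel within the word. */
static unsigned bit_sum_chunked(uint32_t n)
{
    n = (n & 0x55555555u) + ((n >> 1) & 0x55555555u);
    n = (n & 0x33333333u) + ((n >> 2) & 0x33333333u);
    n = (n & 0x0F0F0F0Fu) + ((n >> 4) & 0x0F0F0F0Fu);
    n = (n & 0x00FF00FFu) + ((n >> 8) & 0x00FF00FFu);
    n = (n & 0x0000FFFFu) + (n >> 16);
    return n;
}
```

The loop runs up to 32 iterations, while the chunked version always takes the same handful of operations regardless of the input.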