A power of 2.
For a simple implementation, probably half the size of a word on your machine, so that you can multiply two digits without overflow. So 65536 or 4294967296. Or possibly half the size of the largest integer type, for the same reason but maybe better performance over all.
But I've never actually implemented such a library: if you're using best known algorithms then you won't be doing school-style long multiplication. Karatsuba multiplication (and whatever other clever tricks you use) might benefit from being done in an integer that's more than twice the size of the digits, I really don't know how the performance works out. If so, then you'd be best off using 256 and 32 bit arithmetic, or 65536 and 64 bit arithmetic.
In any case if your representation is binary, then you can pick and choose larger power-of-two bases as convenient for each operation. For instance, you could treat the data as base 2^16 for multiplication, but base 2^32 for addition. It's all the same thing provided you're careful about endian-ness. I'd probably start with base 2^16 (since that forces me to get the endian-ness right to begin with, while 2^8 wouldn't), and see how I get on - as each operation is optimised, part of the optimisation is to identify the best base.
Using a size which isn't a multiple of bytes is a possibility, but then you have to use the same base for everything, because there are unused bits in the storage in specific places according to the base.