views:

889

answers:

9

Fairly easy, if the BigInteger number is 543 I want it to cut off the last digit so that it is 54.

Two easy ways to do this can be :

  1. Use strings, get substring and create new biginteger with the new value.
  2. Use BigIntegers divide method with number 10. ( 543 / 10 = 54.3 => 54 )

The thing is I will be performing this a lot of times with large integers of course.

My guess is that playing around with strings will be slower but then again I haven't used Bigintegers so much and have no idea how expensive the "divide" operation is.

The speed is essential here, what is the fastest way to implement this (memory is no problem only speed) ?

Others solutions are also welcome.

+4  A: 

Divide by 10 is most likely going to be faster.

McWafflestix
I did write a micro-benchmark, and dividing was 4x to 30x times faster depending on the size of number. The longer number is, the higher ratio it is.
notnoop
@msaedd Then i know which one to use then. Thanks
Milan
A: 

The fastest way is dividing the number by 10 with an efficient internal division implementation. The internals of that operation are behind the scenes but certainly non-trivial since the number is stored base-2.

280Z28
A: 

if performance is crucial... don't use java

In languages which compile to machine code (for instance c or c++) the integer divide is quicker by a huge factor. String operations use (or can use) memory allocations and are therefore slow.

My bet is that in java int divisions will be quicker too. Otherwise their vm implementation is really weird.

Toad
Erm, you know that BigInteger is an arbitrarily large integer and therefore won't (and can't) result in a single ALU divide instruction?
Joey
+2  A: 

If you create a BigInteger statically that has the number 10, and then use that to divide by 10, that will be potentially the fastest way to do this. It beats creating a temporary new BigInteger every time.

The problem with substring is that you are essentially creating a new String every single time, and that is much slower, not to mention the slowness that is iterating through a string to get its substring.

AlbertoPL
There is already one: BigInteger.TEN
notnoop
Ok, rephrasing here. Creating a substring is O(1) in Java. You don't have to iterate, as length is known (it's not C) and creating a substring re-uses the char[] the original String uses internally. So you don't get unnecessary allocations from the substring method. Dividing a BigInteger creates a new BigInteger anyway, though, as BigInteger is immutable. So based on your claims alone it's muddy at best which operation is actually faster.
Joey
true. Although correct, the argument isn't valid. The problem is that toString() method is expensive (uses /10 internally), and then parsing the number again is also expensive.
notnoop
@msaeed Ah yes, I forgot about BigInteger.TEN.
AlbertoPL
A: 

The fastest possible implementation would probably be to use a data type whose internal representation uses base 10, i.e. some sort of BCD. Then, division by 10 would simply mean dropping the last byte (or even just incrementing/decrementing an index if you implement it the right way).

Of course, you'd have to implement all arithmetic and other operations you need from scratch, making this a lot of work.

Michael Borgwardt
But the only thing this would make faster is multiplying or dividing by 10. Pretty much everything else would get slower (sometimes by a great deal).
280Z28
Erm, BigDecimal?
starblue
Yes, the whole point is to specifically speed up multiplication and division by 10. And no, BigDecimal does not do this - it uses a BigInteger as an internal representation, which in turn uses an int[]. So it's all based on binary math and lots of additional effort to give the illusion of decimal math, making BigDecimal awfully slow.
Michael Borgwardt
+3  A: 

Dividing by 10 is much faster than using a substring operation. Using the following benchmark, I get about 161x times (ratio is proportional to bit count)

    long divTime = 0;
    long substrTime = 0;
    final int bitsCount = 1000;

    for (int i = 0; i < 1000; ++i) {
        long t1, t2;
        BigInteger random = new BigInteger(bitsCount, new Random());

        t1 = System.currentTimeMillis();
        random.divide(BigInteger.TEN);
        t2 = System.currentTimeMillis();
        divTime += (t2 - t1);

        t1 = System.currentTimeMillis();
        String str = random.toString();
        new BigInteger(str.substring(0, str.length() - 1));
        t2 = System.currentTimeMillis();
        substrTime += (t2 - t1);
    }

    System.out.println("Divide: " + divTime);
    System.out.println("Substr: " + substrTime);
    System.out.println("Ratio:  " + (substrTime / divTime));
notnoop
you should never time like this. You should make 2 separate loops (one with the divide, and one with the string) and time them completely.
Toad
concur. Preferably, I should have split this into two methods, one to time divide, and one for substring, which get run over different samples. You do want to ensure that JVM doesn't optimize one call because of it caching results from other calls.
notnoop
A: 

It's probably premature to even be asking this question. Do it the obvious way (divide by ten), then benchmark it, and optimize it if you need to. Converting to a string representation and back will be much slower.

Mark Bessey
A: 

The toString() alone is probably slower than the substring.

smackfu
A: 

Various people have said that dividing by 10 will be faster than converting to a string and taking the substring. To understand why, just think about the computation involved in converting from a BigInteger to a String, and vice versa. For example:

/* simplified pseudo code for converting +ve numbers to strings */
StringBuffer sb = new StringBuffer(...);
while (number != 0) {
   digit = number % 10;
   sb.append((char)(digit + '0'));
   number = number / 10;
}
return sb.toString();

The important thing to note is that converting from a number to a string entails repeatedly dividing by 10. Indeed the number of divisions is proportional to log10(number). Going in the other direction involves log10(number) multiplications. It should be obvious that this is much more computation than a single division by 10.

Stephen C