This should be easy.

Here's my array (or rather, a method of generating representative test arrays):

>>> ri = numpy.random.randint
>>> ri2 = lambda x: ''.join(ri(0,9,x).astype('S'))
>>> a = array([float(ri2(x)+ '.' + ri2(y)) for x,y in ri(1,10,(10,2))])
>>> a
array([  7.99914000e+01,   2.08000000e+01,   3.94000000e+02,
         4.66100000e+03,   5.00000000e+00,   1.72575100e+03,
         3.91500000e+02,   1.90610000e+04,   1.16247000e+04,
         3.53920000e+02])

I want a list of strings where '\n'.join(list_o_strings) would print:

   79.9914
   20.8
  394.0
 4661.0
    5.0
 1725.751
  391.5
19061.0
11624.7
  353.92

I want to space pad to the left and the right (but no more than necessary).

I want a zero after the decimal if that is all that is after the decimal.

I do not want scientific notation.

..and I do not want to lose any significant digits. (in 353.98000000000002 the 2 is not significant)

Yeah, it's nice to want..

Python 2.5's %g, %fx.x, etc. are either befuddling me or can't do it. I have not tried import decimal yet. I can't see that numpy does it either (although array.__str__ and array.__repr__ are decimal-aligned, they sometimes return scientific notation).

Oh, and speed counts. I'm dealing with big arrays here.

My current solution approaches are:

  1. to str(a) and parse off numpy's brackets
  2. to str(e) each element in the array and split('.') then pad and reconstruct
  3. to a.astype('S'+str(i)) where i is the max(len(str(a))), then pad

It seems like there should be some off-the-shelf solution out there... (but not required)

The top suggestion fails when the dtype is float64:

>>> a
array([  5.50056103e+02,   6.77383566e+03,   6.01001513e+05,
         3.55425142e+08,   7.07254875e+05,   8.83174744e+02,
         8.22320510e+01,   4.25076609e+08,   6.28662635e+07,
         1.56503068e+02])
>>> ut0 = re.compile(r'(\d)0+$')
>>> thelist = [ut0.sub(r'\1', "%12f" % x) for x in a]
>>> print '\n'.join(thelist)
  550.056103
 6773.835663
601001.513
355425141.8471
707254.875038
  883.174744
   82.232051
425076608.7676
62866263.55
  156.503068
+2  A: 

Sorry, but after thorough investigation I can't find any way to perform the task you require without a minimum of post-processing (to strip off the trailing zeros you don't want to see); something like:

import re
ut0 = re.compile(r'(\d)0+$')

thelist = [ut0.sub(r'\1', "%12f" % x) for x in a]

print '\n'.join(thelist)

is speedy and concise, but breaks your constraint of being "off-the-shelf" -- it is, instead, a modular combination of general formatting (which almost does what you want but leaves trailing zeros you want to hide) and an RE to remove the undesired trailing zeros. Practically, I think it does exactly what you require, but your conditions as stated are, I believe, over-constrained.
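To make the two halves of that combination concrete, here is a minimal sketch. The helper name `strip_trailing_zeros` and the sample values are made up for illustration; only the pattern and the `%12f` width come from the snippet above. Note that `%f` keeps just six decimals, so anything beyond that is rounded away before the regex ever runs.

```python
import re

# Same pattern as above: a digit followed by a run of zeros at the very end.
ut0 = re.compile(r'(\d)0+$')

def strip_trailing_zeros(x, width=12):
    """Format x with %f (six decimals), then drop surplus trailing zeros.

    Hypothetical helper for illustration; one digit is always kept after
    the decimal point because the pattern preserves the digit it captures.
    """
    return ut0.sub(r'\1', "%*f" % (width, x))

whole = strip_trailing_zeros(394.0)             # '  394.000000' -> '  394.0'
short = strip_trailing_zeros(20.8)              # '   20.800000' -> '   20.8'
lossy = strip_trailing_zeros(1725.7512345678)   # rounded to six decimals first
```

Because the zeros are removed after padding, the decimal points stay in the same column, but the right-hand side ends up ragged rather than space-padded.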

Edit: original question was edited to specify more significant digits, require no extra leading space beyond what's required for the largest number, and provide a new example (where my previous suggestion, above, doesn't match the desired output). The work of removing leading whitespace that's common to a bunch of strings is best performed with textwrap.dedent -- but that works on a single string (with newlines) while the required output is a list of strings. No problem, we'll just put the lines together, dedent them, and split them up again:

import re
import textwrap

# same trailing-zero pattern as in the first snippet
ut0 = re.compile(r'(\d)0+$')

a = [  5.50056103e+02,   6.77383566e+03,   6.01001513e+05,
         3.55425142e+08,   7.07254875e+05,   8.83174744e+02,
         8.22320510e+01,   4.25076609e+08,   6.28662635e+07,
         1.56503068e+02]

thelist = textwrap.dedent(
        '\n'.join(ut0.sub(r'\1', "%20f" % x) for x in a)).splitlines()

print '\n'.join(thelist)

emits:

      550.056103
     6773.83566
   601001.513
355425142.0
   707254.875
      883.174744
       82.232051
425076609.0
 62866263.5
      156.503068

Alex Martelli
I can't guarantee that %12f won't lose significant digits. (I made an edit and changed the way my test arrays are generated to reflect this.) If I increase to %20f or more to guarantee this, then there is simply too much padding on the left (I want the largest value to have no leading spaces). I'll take back-of-the-cupboard solutions too!
bpowah
+1  A: 

Python's string formatting can either print only the necessary decimals (with %g) or use a fixed number of decimals (with %f). However, you want to print only the necessary decimals, except when the number is a whole number, in which case you want one decimal, and that makes it complex.

This means you would end up with something like:

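import math

def printarr(arr):
    for x in arr:
        if math.floor(x) == x:
            # whole number: force exactly one decimal
            res = '%.1f' % x
        else:
            # otherwise up to 10 significant digits, no trailing zeros
            res = '%.10g' % x
        # pad on the left so every decimal point lands in the same column
        print "%*s" % (15-res.find('.')+len(res), res)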

This first creates a string with either one decimal, if the value is a whole number, or automatic decimals (but only up to 10 significant digits) if it is not. Lastly it prints the string, padded on the left so that the decimal points are aligned.
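If a list of strings is wanted rather than printed lines, and the padding should be no wider than the data requires, the same idea could be sketched as below. This is not the code above but a variation under stated assumptions: the name `align_decimals` is hypothetical, the values are finite, and `repr()` returns the shortest round-tripping digits, which holds on Python 2.7+ and 3.x but not on 2.5. The `decimal` module is used only to expand any exponent notation that `repr()` produces.

```python
from decimal import Decimal

def align_decimals(values):
    """Return strings padded left and right so the decimal points line up.

    Hypothetical helper; assumes finite floats and a Python whose repr()
    gives the shortest round-tripping digits (2.7+ or 3.x).
    """
    parts = []
    for x in values:
        # Decimal's 'f' format turns e.g. '1e+16' into plain positional digits.
        s = format(Decimal(repr(float(x))), 'f')
        if '.' not in s:
            s += '.0'  # always show at least one digit after the point
        parts.append(s.split('.'))
    if not parts:
        return []
    left = max(len(whole) for whole, frac in parts)   # widest integer part
    right = max(len(frac) for whole, frac in parts)   # widest fractional part
    return [whole.rjust(left) + '.' + frac.ljust(right) for whole, frac in parts]

lines = align_decimals([79.9914, 20.8, 394.0, 19061.0])
```

This loops in pure Python, so for very large arrays it will likely be slower than a numpy-level conversion; it is meant to show the padding logic, not to be the fastest option.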

Probably, though, numpy actually does what you want, because you typically do want it to be in exponential mode if it's too long.

Lennart Regebro