views:

192

answers:

5

Is it possible to find a defined sequence in an integer without converting it to a string? That is, is it possible to do some form of pattern matching directly on integers. I have not thought of one but I keeping thinking there should be a mathematical way of doing this. That's not to say it is more efficient.

(edit) I actually what numbers that don't contain the sequences of digits I am looking for.

The integers will be large, at least 289 digits. The sequences to find could be anything, "123", "5"(there is a five), "66666"

I am interested in a general solution but if you would like to help with the acutal problem I am trying to sovle keep reading.

More specifically I am looking for repeating digits of length 4 ie 1324322223313 "2222". I am staring with integers because I will be incrementing though consecutive integers unless I get to an integer with 4 length repeat then I would skip to the the next integer without the repeat. Also I don't what integers with digit larger that 4 ie 12322135 (it has a 5) would be excluded.

The problem might also be stated as. Find all integers in the z = range(x,y) such that z[a] does not contain any repeating digits of length 4 and a digit larger than 4. The range(x,y) may be very large

(Edit) in response to the comment, Yes I would actually like to generate a list, the problem I have is that I am not sure how I could make a generator that satisfies all the conditions I have. Maybe I should think about this more, I agree it would be simpler, but it might be similar to a generator for prime numbers, there is no such generator.

A: 

Maybe you want to take a look here: Cyclic Numbers; they also have an algorithm to build a cyclic number.

This can be useful too: Cycle detection

Rubens Farias
+1  A: 

A list of digits is pretty simple.

# given n, a long integer
digits = [] 
while n != 0:
    digits.append( n%10 )
    n //= 10
digits.reverse()

You can then do your pattern matching on this list of digits. Is that what you're looking for?

S.Lott
interesting solution to geting the integer to a list. I not sure this is better than str(n) and the pattern matching on that. Is it possible to do patter matching directly on integers? I guess I get better at asking my question as I read comments and solutions
Vincent
Isn't the simpler way to get a list of digits in the string just list(str(n)) ?
Ned Batchelder
+3  A: 

You can use this class to have your generator of digits :-)

import math

class DecimalIndexing:
    def __init__(self, n):
        self.n = n
    def __len__(self):
        return int(math.ceil(math.log10(self.n)))
    def __getitem__(self, i):
        if isinstance(i, slice):
            return [self[x] for x in range(i.start, i.stop, i.step or 1)]
        else:
            return (self.n/(10**i))%10
    def __iter__(self):
        for i in xrange(len(self)):
            yield self[i]

you can use it like this:

di = DecimalIndexing(31415927)
for i in xrange(len(di)):
    if di[i:i+4] == [9,5,1,4]:
        print "found"

or like this:

for i in xrange(len(di)):
    if di[i:i+3] == [di[i]]*3:
        print "group of three equal digits at," i

or like this:

if 5 in di:
    print "has a five"

or like this:

if any(x > 5 in di):
    print "some digit was greater than five"

etc.

Keep in mind that the digits indices are "reversed", i.e. read from right to left.

fortran
Thanks for the instruction manual :)
Vincent
A: 

You can make an iterator with the digits ordered from left to right this way

>>> import math
>>> number = int(123456789012345678901)
>>> #Get the maximum power of 10 using a logarithm
>>> max_digit = int(math.log10(number))
>>> range_pow = xrange(max_digit, 0, -1)
>>> # pot is an iterator with 1000, 100, 10, 1...
>>> pot = ( 10**x for x in range_pow)
>>> #Get the digits one by one on an iterator
>>> digits = ( (number/x)%10 for x in pot )
>>> l = list(digits)
>>> print l
[1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 0L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 0L]

Then you can check if the sequence is present... I'm looking for a easy way to do that through the iterator, something like a state machine to parse the result, but I'm not sure if there's a built-in way to do it without making a list or making the finite state machine by yourself...

You can go with something like this, but I think it will kill the performance (compared with a finite state parsing done on low level over the iterator) as you need to build the list, not working directly with iterator:

>>> print l
[1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 0L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 0L]
>>> find = [1,2,3]
>>> lf = len(find)
>>> for i in xrange(len(l)):
...     if find == l[i:i+lf]:
...          print 'Found!', i
Found! 1
Found! 11

Edited: I've come with a more iterative way to do things... The digits parameter could be refined to create a list from a number, if necessary.

import math
from itertools import count

def find_digits_in_number(digits, number):
    #Get the maximum power of 10 using a logarithm
    max_digit = int(math.log10(number))
    range_pow = xrange(max_digit, -1, -1)
    # pot is an iterator with 1000, 100, 10, 1...
    pot = (10 ** x for x in range_pow)
    #Get the digits one by one on an iterator
    dig = ((number / x) % 10 for x in pot)

    #Current will store a moving windows with the 
    #size of the digits length to check if present
    current = []
    for i in digits:
        current.append(next(dig))

    digits = list(digits) 

    founds = []
    #The basic loop is this...
    #for digit, i in zip(dig, count()):
    #    if current == digits:
    #        founds.append(i)
    #    current.pop(0)
    #    current.append(digit)

    #But it can also be optimized like this list comprehension, 
    #while it's much less readable            
    [ (founds.append(i) if current == digits else None,\
      current.pop(0), current.append(digit)) \
      for digit, i in zip(dig, count()) ]

    #Check last posibility, with the last values
    if current == digits:
        founds.append(i + 1)

    return founds


if __name__ == '__main__':
    assert find_digits_in_number((3, 4, 5), 123456789012345678901) == [2, 12]
    assert find_digits_in_number((3, 4), 123456789034) == [2, 10]
Khelben
A: 

@Fortran gives a great solution, it is very versatile.

I ask a modified version on mathoverflow.net, They didn't seem to like it but I got a great answer. This does answer a question slightly different from what I ask here but it is very useful to me.

so to find the test if the digits 4444 are in 35344442345321456754 and assuming I know where I what to look for them then this is a nice solution and obvious once you see it.

(35344442345321456754 / 10**13) % 10**4 == 4444
Vincent