ansaurus

Question

python regular expression to validate types of strings

Answer 1

+1 A:

Use int() and catch the exception.
Use float(), though what exactly do you mean by float?
Use int() and then check the value with an if.
Use datetime formatting.

bluszcz 2010-02-02 12:02:50

Answer 2

+2 A:

Why use regex? I'm convinced it would be slower and more cumbersome.

The int() and float() functions, or better yet the isdigit() method, work well here.

>>> a = "03523"
>>> a.isdigit()
True

>>> b = "963spam"
>>> b.isdigit()
False
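
One caveat before relying on isdigit(): it only inspects the characters, so it rejects signed or decimal input and accepts a few characters that int() cannot parse. A small comparison sketch follows; the sample strings and the int_parses helper are illustrative and not part of the original answer.

```python
# Compare str.isdigit() with what int() actually accepts.
# The last sample is SUPERSCRIPT TWO, which counts as a digit but is not a decimal.
samples = ["03523", "963spam", "-5", "3.14", "\u00b2"]

def int_parses(s):
    """Return True if int(s) succeeds (hypothetical helper for this comparison)."""
    try:
        int(s)
        return True
    except ValueError:
        return False

# Map each sample to (isdigit result, int() result).
results = {s: (s.isdigit(), int_parses(s)) for s in samples}
for s, (digit_ok, int_ok) in results.items():
    print(repr(s), "isdigit:", digit_ok, "int():", int_ok)
```

If negative numbers or unusual digit characters can reach your code, the try/except approach is probably the safer check.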

For question 3, do you mean "Validate if a UTF8 string is a NUMBER of length(1-255)"?

Why not:

def validnumber(n):
  # True only if n parses as an integer between 1 and 255 (inclusive).
  try:
    return 1 <= int(n) <= 255
  except ValueError:
    return False

Dominic Bou-Samra 2010-02-02 12:09:45

Answer 3

+5 A:

Regex is not a good solution here.

Validate if a UTF8 string is an integer:

try:
  int(val)
  is_int = True
except ValueError:
  is_int = False

Validate if a UTF8 string is a float: same as above, but with float().
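
As a rough sketch of the float() variant: note that float() also accepts strings such as 'nan' and 'inf', which may or may not count as valid for your purposes. The is_float name and the allow_special flag below are assumptions made for illustration.

```python
import math

def is_float(val, allow_special=False):
    """Return True if float(val) succeeds.

    allow_special is a hypothetical flag: when it is False, values that
    parse to NaN or infinity (for example 'nan' or 'inf') are rejected.
    """
    try:
        number = float(val)
    except (TypeError, ValueError):
        return False
    return allow_special or math.isfinite(number)

print(is_float("3.14"), is_float("abc"), is_float("nan"))
```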

Validate if a UTF8 string is of length(1-255):

is_of_appropriate_length = 1 <= len(val) <= 255

Validate if a UTF8 string is a valid date: this is not trivial. If you know the right format, you can use time.strptime() like this:

# Validate that the date is in the YYYY-MM-DD format.
import time
try:
  time.strptime(val, '%Y-%m-%d')
  is_in_valid_format = True
except ValueError:
  is_in_valid_format = False
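
If more than one format is acceptable, the same idea extends to trying each in turn. This is only a sketch: the parse_date helper and the list of candidate formats are assumptions, and it uses datetime.strptime so that the parsed date is returned instead of a flag.

```python
from datetime import datetime

# Candidate formats are an assumption for illustration; list the ones you accept.
CANDIDATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")

def parse_date(val, formats=CANDIDATE_FORMATS):
    """Return a date for the first format that matches, or None if none do."""
    for fmt in formats:
        try:
            return datetime.strptime(val, fmt).date()
        except ValueError:
            continue  # This format did not match; try the next one.
    return None

print(parse_date("2010-02-02"), parse_date("spam"))
```

Because strptime builds a real date, impossible dates such as February 30 are rejected as well as malformed strings.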

EDIT: Another thing to note. Since you specifically mention UTF-8 strings, it would make sense to decode them into Unicode first. This would be done by:

my_unicode_string = my_utf8_string.decode('utf8')

It is interesting to note that when converting a Unicode string to an integer with int(), for example, you are not limited to the "Western Arabic" numerals used in most of the world. int(u'١٧') and int(u'१७') will both correctly parse as 17, even though they are written in Eastern Arabic and Devanagari numerals respectively.
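
Putting the decoding step and the int() check together might look like the sketch below. It assumes Python 3, where the UTF-8 input arrives as a bytes object; the utf8_to_int name is hypothetical.

```python
def utf8_to_int(raw):
    """Decode UTF-8 bytes and parse them as an integer; return None on failure."""
    try:
        text = raw.decode("utf8")
        return int(text)
    except (UnicodeDecodeError, ValueError):
        # Either the bytes were not valid UTF-8 or the text was not an integer.
        return None

arabic_indic = "١٧".encode("utf8")
devanagari = "१७".encode("utf8")
print(utf8_to_int(arabic_indic), utf8_to_int(devanagari), utf8_to_int(b"12a"))
```

If only ASCII digits should be accepted, an extra check on the decoded text would be needed before calling int().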

Max Shawabkeh 2010-02-02 12:12:11
