I have a fairly large database with a column whose strings are, for the most part, really just ints, e.g. "1234" or "345". However, some of them have strings of varying length prepended to them, e.g. "a123" or "abc123".

Is there a smart way to create a new column with just the integer values, so that "abc123" would become "123"? I know I can read all of the rows in PHP and use a regex to do it pretty easily, but I wanted to see if there was a way to let SQL do this for me.

+1  A: 

Unfortunately, you can't do this in MySQL by itself. MySQL has regex matching capabilities, but before version 8.0 it has no regex replacement capabilities (REGEXP_REPLACE() was only added in 8.0). Your best option is to use a regex in PHP to perform the replacement.

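// Select only the rows whose value is not already purely numeric.
$sql = "SELECT `id`, `myMixedColumn` FROM `myTable` "
     . "WHERE `myMixedColumn` NOT RLIKE '^[[:digit:]]+$'";
$r = mysql_query($sql);

$updates = array();
while ($row = mysql_fetch_assoc($r)) {
    // Strip every non-digit character; append each statement instead of overwriting.
    $updates[] = sprintf("UPDATE `myTable` SET `myIntField` = %s WHERE `id` = %d",
        preg_replace("@\\D@", "", $row['myMixedColumn']),
        $row['id']
    );
}

// Run the collected updates.
foreach ($updates as $update) {
    mysql_query($update);
}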
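
To see what the preg_replace call does to individual values, here is a minimal standalone sketch. The helper name digits_only is made up for illustration and is not part of the code above; the sample values are the ones from the question.

```php
<?php
// Hypothetical helper, for illustration only: keep just the digits of a value.
function digits_only($value)
{
    // \D matches any character that is not a digit, so every such
    // character is replaced with the empty string.
    return preg_replace('/\D/', '', $value);
}

foreach (array('1234', 'a123', 'abc123') as $sample) {
    echo $sample, ' => ', digits_only($sample), PHP_EOL;
}
// 1234 => 1234
// a123 => 123
// abc123 => 123
```

Note that this removes non-digits anywhere in the string, not only at the start, so a value like "12a34" would come out as "1234".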
nickf
+1  A: 

If you really want to do it in SQL, just for the sake of not doing it in PHP, you could write a small function that does the job until MySQL implements a regex replace, but I wouldn't bet anything on the performance.

Something like the function below only works if the letters are at the beginning and the string contains no non-digit characters other than [a-zA-Z]. You'd also have to check how it behaves in different charsets.

DELIMITER //

CREATE FUNCTION last_letter(s VARCHAR(100)) RETURNS INT
DETERMINISTIC
BEGIN
  DECLARE last, current INT DEFAULT 0;
  DECLARE letter_a INT;
  DECLARE letter_z INT;
  DECLARE letter_iter INT;
  SELECT ORD('a') INTO letter_a;
  SELECT ORD('z') INTO letter_z;
  SET letter_iter = letter_a;
  # Will loop for all letters a to z
  WHILE letter_iter <= letter_z DO
    # Will get the last case-insensitive occurrence of a letter
    SELECT LOCATE(CHAR(letter_iter), REVERSE(LOWER(s))) INTO current;
    IF current > 0 THEN
      # Convert the position in the reversed string to one in the original
      SELECT LENGTH(s) - current + 1 INTO current;
    END IF;
    # Was that the rightmost letter?
    IF current > last THEN
      SET last = current;
    END IF;
    SET letter_iter = letter_iter + 1;
  END WHILE;
  # Return the max we found (0 if there are no letters at all)
  RETURN last;
END //

DELIMITER ;

And then to get the integer values:

UPDATE test_table SET int_result = 
  CAST(SUBSTR(str_value, last_letter(str_value) + 1) AS SIGNED);
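
If you want to trace what last_letter computes without touching the database, the same logic can be sketched in PHP. This is only a rough port for experimenting, assuming plain ASCII input; the names last_letter_pos and trailing_part are made up for this example.

```php
<?php
// Hypothetical PHP port of the SQL last_letter() function, for tracing only.
// Returns the 1-based position of the rightmost letter a-z (case-insensitive),
// or 0 if the string contains no letters.
function last_letter_pos($s)
{
    $last = 0;
    $lower = strtolower($s);
    foreach (range('a', 'z') as $letter) {
        $pos = strrpos($lower, $letter); // 0-based index, or false if absent
        if ($pos !== false && $pos + 1 > $last) {
            $last = $pos + 1;            // convert to an SQL-style 1-based position
        }
    }
    return $last;
}

// Mirrors SUBSTR(str_value, last_letter(str_value) + 1) from the UPDATE.
function trailing_part($s)
{
    return substr($s, last_letter_pos($s));
}

foreach (array('123', 'abc123', '12a34', 'a-123') as $sample) {
    echo $sample, ' => ', trailing_part($sample), PHP_EOL;
}
// 123 => 123
// abc123 => 123
// 12a34 => 34
// a-123 => -123
```

The last two samples show the limitation mentioned above: digits before a letter are dropped, and characters outside [a-zA-Z] stay in the result.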
lpfavreau
yikes! kudos for figuring that out, but crikey... do it in php!
nickf
Yes, as nickf says and as stated before, you might get more speed and flexibility using PHP. But since you already knew it would be easy with PHP and wanted to know whether you could do it in SQL, here's your answer! :)
lpfavreau