tags:

views:

2851

answers:

7

I have a column, say PROD_NUM that contains a 'number' that is left padded with zeros. For example 001004569. They are all nine characters long.

I do not use a numeric type because the normal operation on numbers do not make sense on these "numbers" (For example PROD_NUM * 2 does not make any sense.) And sense they are all the same length, the column is defined as a CHAR(9)

CREATE TABLE PRODUCT (
    PROD_NUM CHAR(9) NOT NULL
    -- ETC.
)

I would like to constrain PROD_NUM so it can only contain nine digits. No spaces, no other characters besides '0' through '9'

A: 

Cast it to integer, cast it back to varchar, and check that it equals the original string?

Joel Coehoorn
(with leading zeros added back)
RedFilter
Ah, good point.
Joel Coehoorn
A: 

In MSSQL, I might use something like this as the constraint test:

PROD_NUM NOT LIKE '%[^0-9]%'

I'm not an Oracle person, but I don't think they support bracketed character lists.

richardtallent
Oracle's LIKE operator does not support regular expressions.
Jeffrey Kemp
Well there is regexp_like.
Theo
A: 

Not sure about performance but if you know the range, the following will work. Uses a CHECK constraint at the time of creating the DDL.

alter table test add jz2 varchar2(4)
     check ( jz2 between 1 and 2000000 );

as will

alter table test add jz2 varchar2(4)
     check ( jz2 in (1,2,3)  );

this will also work

alter table test add jz2 varchar2(4)
         check ( jz2 > 0  );
blispr
-1, this fails the OP requirement that numeric digits only are stored in the column, e.g., "insert into test values ('1.1');
DCookie
+6  A: 
REGEXP_LIKE(PROD_NUM, '^[[:digit:]]{9}$')
Tim Sylvester
Why the two sets of square brackets around :digit: instead of just one set.
Shannon Severance
One set indicates a character class, two indicates a named character class. See http://en.wikipedia.org/wiki/Regular_expression#POSIX_character_classes
Tim Sylvester
One small caveat: REGEXP_LIKE doesn't make use of indexes like a LIKE operator would. This is fine for the OP's problem (an INSERT/UPDATE constraint), but may not provide adequate performance when searching records after the fact.
richardtallent
@rihardtallent: indexes are never used by check constraints anyway.
Jeffrey Kemp
+8  A: 

You already received some nice answers on how to continue on your current path. Please allow me to suggest a different path: use a number(9,0) datatype instead.

Reasons:

  • You don't need an additional check constraint to confirm it contains a real number.

  • You are not fooling the optimizer. For example, how many prod_num's are "BETWEEN '000000009' and '000000010'"? Lots of character strings fit in there. Whereas "prod_num between 9 and 10" obviously selects only two numbers. Cardinalities will be better, leading to better execution plans.

  • You are not fooling future colleagues who have to maintain your code. Naming it "prod_num" will have them automatically assume it contains a number.

Your application can use lpad(to_char(prod_num),9,'0'), preferably exposed in a view.

Regards, Rob.

(update by MH) The comment thread has a discussion which nicely illustrates the various things to consider about this approach. If this topic is interesting you should read them.

Rob van Wijk
You are right!!
Theo
-1. Perhaps Rob's right, perhaps not. We don't know the underlying business process, only the stated requirement. What if future plans were to allow an alpha character in the string when digits ran out? I can change a constraint, but modifying the column data type would be painful. What I would agree with is that the name of the column is poor - it's not a NUM, it's an ID or CODE. Also, the stated solution of NUMBER(9,0) allows negative numbers.
DCookie
@DCookie: are you really suggesting to turn number columns into varchar2 columns, just because a business requirement may possibly change in the future? Or am I misinterpreting you? My course of action -if I want to be prepared for such a situation- is to not make it the primary key, but introduce a surrogate key. This way I can change the datatype relatively easy by either (rename table, introduce new table, change foreign keys to refer new table, drop renamed table) or (backup column data, update column to null, change datatype, update column with backup data).
Rob van Wijk
@Rob, you raise many good points. However the world is filled with "numbers" that aren't numbers, but are instead codes or IDs. VINs (Vehicle Identification Numbers) and many serial numbers come to mind. The users have decided to call these product numbers and decided that they want them to be zero padded on the left. At some layer of the application I will have to have a product number that is a zero padded string. I would rather be consistant all the way to the base table.
Shannon Severance
@Rob, I'd like to respond point by point. You raise good things to think about but ultimately I disagree that I should change approach. "You don't need an additional check constraint to confirm it contains a real number." 1) An additional check constraint is not a burden, if I get it right. 2) as DCookie points out I would need a check constraint to inforce non-negative numbers.
Shannon Severance
"You are not fooling the optimizer. For example, how many prod_num's are "BETWEEN '000000009' and '000000010'"?" Yes, but 1) I don't think a range search on product numbers makes any sense. 2) And if a range search did make sense, it would be done by the users as an adhoc query against the view with a hidden lpad(... etc). Since the comparison would not be against the value indexed, but against a function on that index, would the index be used? Would I need to materilize that view and index on it? Etc.
Shannon Severance
"You are not fooling future colleagues who have to maintain your code." This is a hard one. I think it is tilting at windmills to try to undo the other common use of the word number to mean "Code or ID." As I said, product number is the business/user name for the code. Maintence developers will have to learn the business languange to support the users.
Shannon Severance
"Your application can use lpad(to_char(prod_num),9,'0'), preferably exposed in a view." And so that the lpad(...) column is meaningful to end users it will be named product_num. Great now the type changes depending on where in the system it is going to be.
Shannon Severance
@Shannon: the purpose of my answer is just to make you aware of possible pitfalls when choosing the CHAR(9) approach. If none of them apply in your case, all is fine. And at least, it made you think and you made an informed decision. Oh, and yes, DCookie and you are right about the extra needed check constraint (prod_num > 0) in my proposal.
Rob van Wijk
@Rob, I'm suggesting no such thing. You are the one suggesting it's a number field, when the requirements given explicitly state otherwise. It's a character field, consisting of numeric digits. Your plan of action for a change is exactly what I said it was - a pain, which could conceivably cause unacceptable downtime.
DCookie
@Rob: "the purpose...is just to make you aware of possible pitfalls when choosing the CHAR(9) approach.". Aren't you simply introducing your own possible pitfalls?
DCookie
@DCookie: "A string to contain only digits" sounds like a requirement for a number field to me. The fact that it uses a format mask (left padding with zeros) for display, doesn't change that. We can of course agree to disagree on this. I guess I was just surprised by your downvote, which is something I reserve for incorrect or dangerous advice. Hardly applicable here, I'd say.
Rob van Wijk
@Rob, the fact is, the requirements state it's not a number. A number can be mathematically manipulated. You're calling it a number to enforce a constraint, that's all. Making it a number possibly introduces application issues, e.g., suppose I code my program to stuff that "numeric" field into a C or java int? Now my app breaks if I decide to allow alphas. And I believe it IS incorrect AND dangerous advice to use a NUMBER as you suggest - it leads people to believe the field is something it is not.
DCookie
Okay, one last comment and I'll shut up. I am currently working at a site where this exact scenario has played out. A centrally important "number" to our system started out being all numeric digits. Satellite systems began assuming numbers; so did the apps. Then, our customer says, "oh, BTW, we want you to prefix your "numbers" with two alpha chars. Guess what? Years down the road and we still feel the pain. ALWAYS LOOK DOWN THE ROAD. CONSIDER THE WORST CASE. Don't assume that because things look like numbers today that they must always be numbers. Externalities count!
DCookie
Okay, please allow one last comment from me as well. It is my belief that one has to choose the datatype that constraints the data just right. Just because it is not mathematically manipulated doesn't mean it's not a number. I'm not going to calculate with my surrogate key numbers, not with my telephone numbers, and not with my Tour de France rider numbers. I still consider them numbers because that constraints them properly. I'll never advice declaring such columns VARCHAR2(N) (4000?) or even a CLOB, to be prepared for everything that might happen to them. I use a view layer for that.
Rob van Wijk
@Rob, "the purpose of my answer is just to make you aware of possible pitfalls when choosing the CHAR(9) approach." And thank you for that. I wanted to show that I had considered the issues you raised. And I liked the answer enough to up-vote it.
Shannon Severance
+1  A: 

I think Codebender's regexp will work fine but I suspect it is a bit slow.

You can do (untested)

replace(translate(prod_num,'0123456789','NNNNNNNNNN'),'N',null) is null

Theo
Similar but a little shorter: `TRANSLATE( prod_num, 'A0123456789','A') IS NULL`.
Dave Costa
+3  A: 

Works in all versions:

TRANSLATE(PROD_NUM,'123456789','000000000') = '000000000'
David Aldridge
+1, fast, efficient, effective.
DCookie