Apache's StringUtils.isNumeric() method specification says:
Checks if the String contains only unicode digits. A decimal point is not a unicode digit and returns false. Null will return false. An empty String ("") will return true.

Is this logically right? Why do they see empty string as numeric?

+8  A: 

Is this logically right?

Yes, it's correct, since all characters in the empty string are unicode digits. (Or equivalently, no characters in the empty string are not unicode digits.)

It is what logicians call "vacuously true". It's like saying that all elephants in my apartment are green. It's true, since there are no elephants in my apartment.
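The vacuous-truth reading can be tried out with a minimal Java sketch. The class and method names below are hypothetical, and `containsOnlyDigits` only mimics the contract quoted in the question (null gives false, "" gives true); it is not the Commons Lang source. As far as I know, the Commons Lang 3.x Javadoc documents `isNumeric("")` as false, so the quoted behavior applies to the older 2.x line.

```java
final class VacuousTruthDemo {
    // Hypothetical helper mirroring the quoted contract: null -> false,
    // otherwise true when every character is a Unicode digit.
    static boolean containsOnlyDigits(String s) {
        return s != null && s.chars().allMatch(Character::isDigit);
    }

    // The mirror-image claim: true when no character is a Unicode digit.
    static boolean containsNoDigits(String s) {
        return s != null && s.chars().noneMatch(Character::isDigit);
    }

    // The existential claim: true when at least one character is a Unicode digit.
    static boolean containsSomeDigit(String s) {
        return s != null && s.chars().anyMatch(Character::isDigit);
    }

    public static void main(String[] args) {
        System.out.println(containsOnlyDigits(""));     // true  (no character violates "is a digit")
        System.out.println(containsNoDigits(""));       // true  (no character violates "is not a digit")
        System.out.println(containsSomeDigit(""));      // false (there is no character at all)
        System.out.println(containsOnlyDigits("12.5")); // false (the decimal point is not a digit)
        System.out.println(containsOnlyDigits(null));   // false
    }
}
```

Both "all characters are digits" and "no characters are digits" hold for the empty string, while "some character is a digit" does not. That is the difference between a universal and an existential claim.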

Why do they see empty string as numeric?

The spec doesn't say that the string represents a number. It says that the string contains only unicode digits.

You say,

I'm confused because specification says: "Checks if the String contains only unicode digits." I don't see that "" contains digits....

A string contains only unicode digits, if and only if it does not contain non-unicode digits. The empty string clearly does not contain non-unicode digits, therefore it contains only unicode digits.

aioobe
You could just as well argue that none of the characters in the empty string are unicode digits.
klausbyskov
Nice....I was just about to say that....
The Elite Gentleman
Yes! Exactly! I don't see any unicode digits in the empty string either!
Andriy Sholokh
It occurs to me it's probably for input validation: you can have separate "is numeric" and "is empty" validation checks and by this definition the two are completely orthogonal.
Rup
Well... From this point of view ("It is what logicians call "vacuously true". It's like saying that all elephants in my apartment are green. It's true, since there are no elephants in my apartment."), the specification is correct. But this is very debatable to me... It's the same as saying "It is true that Joseph is a millionaire" when Joseph has no money at all... Funny :)
Andriy Sholokh
Yes... This is a perfectly correct statement: "A string contains only unicode digits, if and only if it does not contain non-unicode digits. The empty string clearly does not contain non-unicode digits, therefore it contains only unicode digits." But one could also say "a string contains only non-digit characters if and only if it does not contain digit characters, therefore "" contains only non-digit characters" :) It's rather confusing to me :)
Andriy Sholokh
Well, not really. You could however say that the empty string contains only digit characters, and only non-digit characters. (It contains only characters that represent digits and non-digits. No such characters exist, but still all characters in the empty string satisfy such a requirement.)
aioobe
Please, see my answer below
Andriy Sholokh
A: 

An empty string satisfies all constraints. (I forgot where I said it before...)

But here's what Wikipedia says:

From String:

The empty string is the unique string over Σ of length 0, and is denoted ε or λ.

And Empty String:

The empty string is a syntactically valid representation of zero in positional notation (in any base), which does not contain leading zeros. Because handling of empty strings is problematical (particularly in a graphic environment), the zero number is traditionally represented by one decimal digit 0 instead.

In essence, "an empty string is digitally represented as a zero".

The Elite Gentleman
I'm confused because specification says: "Checks if the String contains only unicode digits." I don't see that "" contains digits....
Andriy Sholokh
`is denoted ε or λ.` Or "".
Carl Manaster
Mathematically, "" is denoted by ε or λ. If you follow the 2nd statement, a string of length 0 is represented by a 0. 0 is numeric and thus it passes.
The Elite Gentleman
Please, see my answer below
Andriy Sholokh
@The Elite Gentleman, what do you mean by "*An empty string satisfies all constraints.*"? I can easily come up with constraints that are not met by the empty string... I suspect I misunderstood your wording...
aioobe
@aioobe, for lack of a better word, I used "constraints" (I didn't know if I should have used `condition`). In AI, "an empty solution set (in classification algorithms) satisfies every problem in the search space"; I didn't know how to apply that to strings and maths.
The Elite Gentleman
*In AI, an empty solution set (in classification algorithms) satisfies every problem in the search space*... Isn't this a bit far from Java-strings? I think your answer would be better if you explained the analogy a bit more...
aioobe
@aioobe, it is, but that statement was taken from a mathematical theorem (which I don't quite remember). I therefore took that approach to explain why an empty string satisfies all integer values (hence my 2 quotes posted above).
The Elite Gentleman
A: 

java.lang.Integer.parseInt("") will fail: it throws a NumberFormatException.

It's not a matter of logic. It's not a matter of common sense either: no number has ever been represented by no symbol at all. There's no strong argument why an empty string should represent 0.

If the method were named containsOnlyNumeric(), it would be natural to return true for "", according to our math textbooks. However, the method is named isNumeric(), so its treatment of "" isn't natural. There's also no apparent reason why null should return false; I would throw an exception for null.

But it is what it is, it is well documented and what more can you ask for?
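To make the points in this answer concrete, here is a small Java sketch. `parsesAsInt` and `isNonEmptyDigits` are hypothetical helpers, not part of Commons Lang or the JDK; the second one illustrates the stricter behavior argued for above (reject "", throw on null) and is only one possible design.

```java
import java.util.Objects;

final class EmptyStringParsingDemo {
    // Hypothetical helper: true only if Integer.parseInt accepts the input.
    // parseInt throws NumberFormatException for "" (and for null).
    static boolean parsesAsInt(String s) {
        try {
            Integer.parseInt(s);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Hypothetical stricter check: throws NullPointerException on null,
    // rejects the empty string, and otherwise requires every char to be a digit.
    static boolean isNonEmptyDigits(String s) {
        Objects.requireNonNull(s, "s must not be null");
        return !s.isEmpty() && s.chars().allMatch(Character::isDigit);
    }

    public static void main(String[] args) {
        System.out.println(parsesAsInt(""));         // false
        System.out.println(parsesAsInt("256"));      // true
        System.out.println(isNonEmptyDigits(""));    // false
        System.out.println(isNonEmptyDigits("256")); // true
    }
}
```

Callers who want "is numeric" and "is empty" to stay separate checks can keep the documented behavior and add their own emptiness test instead.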

irreputable
Please, see my answer below
Andriy Sholokh
A: 

I was not the only one who asked this question :) People opened this defect in Apache's Jira: https://issues.apache.org/jira/browse/LANG-428

They closed it without a fix, only to keep backwards compatibility (to follow the method specification).

But everybody agreed that the current behavior of the method is wrong.

Andriy Sholokh
Quote from the link you gave: "An empty String has no characters, so cannot contain an illegal character."....hence why it's an acceptable behaviour.
The Elite Gentleman
But it also does not contain a legal character. One guy asked a reasonable question in the comments: "Maybe the method could be better named isNotNonNumeric()?" I 100% agree with him. My point is: if you write "17"+"256", everything is clear, but ""+"256"... well... An empty string should not be considered a digit, whether 0 or 1 or anything else.
Andriy Sholokh
I agree. The name of the function is slightly misleading and unfortunate. It could for instance have been called `containsOnlyDigits` or `isNotNonNumeric` as you suggest. There are a few surprises like this regarding the empty string. For example, would you say that an empty string ever contains anything? That is, should `"".contains(.....)` ever evaluate to true? Well it does! (I'll leave it as an exercise for you to figure out how ;) You'll just have to get used to these, shall we say "counter-intuitive properties" of the empty string.
aioobe