I'd like to hear some of the more pernicious 'gotchas' that exist out there. Any language, system, or library is fine.
The doubling up of equality and assignment operators in VB. Most of the time, I'm able to switch back and forth between VB[.NET] and C-like languages without too much friction, but this difference is subtle enough to bite me.
Dim result As Double
Dim auxResult as Double
result = auxResult = input1 * input2 ' D'oh!
Perl's $_
The default magic variable. More than once I've been bitten by ambiguity about what it contained. Worse, code that was simple thanks to its magic became much uglier once I added a special case that unwittingly changed its value.
The canonical example is assignment within a comparison expression in C: if (alert_code = red) launch_missiles();
and all that. When compilers start warning about the use of a language feature, it's a big sign that the feature shouldn't exist.
In .NET the System.Drawing.Bitmap object has a method called "GetHbitmap()" which returns a native GDI HBITMAP handle. The code I was maintaining was a dynamic image generator for a web site that would take a product image and dynamically render a "15% off!" medallion or something on the image.
The System.Drawing stuff in .NET 1.1 was half baked: some methods were managed-only, some methods required native HBITMAPs, etc. So the original author was calling GetHbitmap() and then disposing of the Bitmap object. Needless to say, this didn't work and the server would crash every few hours. We found that the GDI handle count for aspnet_wp.exe (the ASP.NET worker process) was astronomical (65,000+ versus the normal 500-800).
Upon investigation, the documentation for GetHbitmap() says that you must call the native GDI "DeleteObject()" Win32 API function to release the GDI handle. This function is not available in .NET, so you must make a P/Invoke call to reach it.
For the non-.NET'ers reading this, it's akin to making a native C call from Java or PHP. Totally non-intuitive.
It took us a few days to track it down. Once the fix was in place, everything worked great! :)
I strongly recommend reading Java Puzzlers. It's a whole book about gotchas. There is a sample chapter on the webpage.
The best ones are in C++; it lends itself to beautiful things like:
if (value = comparison)
{
    // do something with value and wonder
    // why it is always 100% equal to comparison
}
C# spoiled all the fun on that one. A little off topic, pointers give out cool things too; just two days ago I had to write this gorgeous piece of code:
level = (int)(*((double *)(void.Ptr())));
The Lisp-like parentheses are the result of paranoia after 30 minutes staring wide eyed at a screen when the simpler level = *(int *)void.Ptr(); was not working.
Casting. Especially in weakly typed languages, casting can be one of the biggest gotchas. One of the worst casting situations I ran into (albeit quite rare) was when a string was being cast into a number, but only the first character was being converted. So the string "31" became 3.
Auto-boxing in Java:
Integer n = 128;
Integer m = 128;
assert n <= m; // True
assert n >= m; // True
assert n == m; // False
Or anything involving Java's BigDecimals.
EDIT: Actually, this PHP gotcha is probably worse.
Namespaces in LINQ to XML, mainly because it is my most recent gotcha. Examples of LINQ to XML rarely include a namespace declaration in the sample file. However, once you move to a proper live XML file format (like GPX), failing to realise you need to include the namespace in your query leads to much returning of empty results.
In C:
Using math routines such as sin, cos or sqrt and forgetting to include math.h.
On old compilers that will just compile and link without warnings, but not work, because without a prototype C assumes the function returns int and passes the arguments as given, with no conversion. Pass an integer and the function, which expects a double, reads garbage; the double it returns is then misread as an int.
In some programming languages (like Java and C#) strings are designed as immutable classes, which means that certain methods don't change the original object but instead return a modified object copy. When starting with Java I used to forget this all the time and wondered why the replace method didn't seem to work on my string object.
String text = "foobar";
text.replace("foo", "super");
System.out.print(text); // still prints "foobar" instead of "superbar"
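The same trap exists in other languages with immutable strings. A minimal sketch in Python 3 (not Java, so only an analogy to the snippet above): `str.replace` returns a new string, and the fix is to keep the returned value.

```python
text = "foobar"

# replace() returns a new string; the result is thrown away here.
text.replace("foo", "super")
print(text)  # foobar

# Keeping the returned copy is what was intended.
text = text.replace("foo", "super")
print(text)  # superbar
```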
Perl's glob() iterator. In scalar context, it will return successive results, followed by an undef, even if the argument changes.
$ touch quux
$ touch quuux
$ perl -w
sub globme {
my $pattern = shift;
my $result = glob($pattern);
return $result;
}
print 'got: ', globme('qu*x'), "\n";
print 'got: ', globme('foo'), "\n";
__END__
got: quuux
got: quux
The 'foo' is disregarded because that call to glob() hasn't exhausted its results yet. To correctly use glob() it should always be in list context, even if you only want one result:
my ($result) = glob($pattern);
or care should be taken to make certain that the iterator is called until it returns undef before a fresh iteration is desired.
In Ruby, code like:
2 + "4"
"hello no. " + 1
gives run-time type errors. It's easy to fix, but I regularly get these errors when trying to print out debug strings. Sure, it's logical; just not very intuitive in a dynamically typed language. You could always redefine the meaning of +, but that's another story.
This isn't a big one for me anymore, but it bit me when I came across it and I see it hit a lot of people, so it stands out to me. Python allows nested functions and closures, as well as functions as objects. A common idiom to attempt is something like the following:
functions = []
for i in xrange(10):
    def f():
        return i
    functions.append(f)
for f in functions:
    print f()
The expectation is that one can create a series of functions, using the outer variables, and later call them in an expected way. However, anyone understanding closures will quickly notice the value of i will be 9 for every single call to the ten individual functions created. The name is looked up in the function's closure when it gets called, and they are all called after the loop, when i still has its last value, which is 9.
The solution is to rebind the variable as a local inside the nested function:
functions = []
for i in xrange(10):
    def f(i=i):
        return i
    functions.append(f)
for f in functions:
    print f()
In the new code, the function is defined with a single parameter, i, which is given a default value computed at definition time, so it has the correct value of i.
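Another way to get the same effect, sketched here for Python 3 (so `range` and `print()` rather than the Python 2 forms above), is to build each function in a small factory so that every closure gets its own scope. The names `make_f`, `shared` and `separate` are made up for illustration.

```python
def make_f(i):
    """Return a function that remembers the i passed to this call."""
    def f():
        return i
    return f


# Late binding: every g looks up the one shared loop variable when called.
shared = []
for i in range(10):
    def g():
        return i
    shared.append(g)

# Each call to make_f creates a fresh scope with its own binding of i.
separate = [make_f(n) for n in range(10)]

print([g() for g in shared])    # [9, 9, 9, 9, 9, 9, 9, 9, 9, 9]
print([f() for f in separate])  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

The default-argument trick is shorter, but the factory keeps the function's signature clean, so a caller cannot accidentally override `i`.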
I once went back and forth between two systems on an hour-to-hour basis. One had an editor where Ctrl-X meant save and exit; on the other, it meant Exit without saving.
Bill Drissel
Default arguments in Python:
def f(x, a = []):
    if len(a) == 0:
        a.append(x)
    print a
>>> f(1)
[1]
>>> f(2, [5])
[5]
>>> f(3)
[1] # you expected [3], right?
By appending to the argument a when it is not specified, this actually changes the default argument value for subsequent calls! One way to fix this is:
def f(x, a = None):
    if a is None:
        a = [x]
    print a
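To see why the default changes, it may help to look at where Python keeps it. This is a small Python 3 sketch (the function returns the list instead of printing it, which is my change for the sake of the demonstration): the default list is created once, when the `def` runs, and lives on the function object in `__defaults__`.

```python
def f(x, a=[]):
    # The default list is created once, when the def statement runs.
    if len(a) == 0:
        a.append(x)
    return a


print(f.__defaults__)  # ([],)
f(1)
print(f.__defaults__)  # ([1],)

# Later calls without a second argument all get that very same list object.
print(f(3) is f(4))    # True
```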
I worked on a Data General mini-computer that had a couple of quirks in the Fortran compiler. When we first got it, I wrote a sample program about 10 lines long. We could not get it to compile and run properly, which was aggravating because we didn't have a hard disk and each compile/run cycle took about 15 minutes of reading and punching paper tape. I eventually realized that my little program defined a function named "mpy", but the compiler had an internal function of the same name used for multiplication. (In other words, it should have been a reserved word.)
On the same computer/compiler, if a subtraction resulted in zero, it would be a negative zero which would not test equal to a positive zero. IIRC, if you set a floating point number to 6 and printed it out, it was equal to something like 5.99994.
Pretty silly...
someheader.h
START_OF_FILE
7/***********************************************************************************************************************************************************************************
LOTS OF DOCUMENTATION ,LOTS OF DOCUMENTATION ,LOTS OF DOCUMENTATION ,
LOTS OF DOCUMENTATION ,LOTS OF DOCUMENTATION ,LOTS OF DOCUMENTATION ,
LOTS OF DOCUMENTATION ,LOTS OF DOCUMENTATION ,LOTS OF DOCUMENTATION ,
LOTS OF DOCUMENTATION ,LOTS OF DOCUMENTATION ,LOTS OF DOCUMENTATION ,
LOTS OF DOCUMENTATION ,LOTS OF DOCUMENTATION ,LOTS OF DOCUMENTATION ,
LOTS OF DOCUMENTATION ,LOTS OF DOCUMENTATION ,LOTS OF DOCUMENTATION ,
LOTS OF DOCUMENTATION ,
*/
// code...
END_OF_FILE
in a C program composed of MANY files... It took 2 days to find where that damned "7" was (the syntax check couldn't report the file or line of the error).
C
Maybe not the worst gotcha I've seen, but recently I came across a colleague trying to set a double to an undefined value. He had read about the 0xFFFA5A5A memory pattern, which can also act as a NaN for floats on SGI. How did he use it?
double dvalue = (double)0xFFFA5A5A;
Did he test it? No.
It took me a while to explain all the problems with this line.
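The main problem can be shown without a C compiler. The sketch below is Python 3, not the original C, and assumes IEEE 754 single precision: a cast converts the integer's value, while the NaN only appears if the same 32 bits are reinterpreted as a float. (The pattern is also only 32 bits wide, whereas a double has 64.)

```python
import math
import struct

PATTERN = 0xFFFA5A5A

# What the cast does: convert the integer's numeric value to floating point.
as_value = float(PATTERN)

# What was presumably intended: reuse the 32 bits as an IEEE 754 single.
as_bits = struct.unpack("<f", struct.pack("<I", PATTERN))[0]

print(as_value)              # 4294597210.0
print(math.isnan(as_value))  # False
print(math.isnan(as_bits))   # True
```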
Visual C++ 6 getline bug ><
string input;
getline(std::cin, input, '\n');
You need to press Enter twice for it to work. Very annoying, and it shouldn't happen in a commercial product.
Objective-C:
if(string==string2)
doesn't work; it needs
if([string isEqual:string2])
I posted an answer about Java already, but I think this is a better (worse?) gotcha.
PHP
Specifically, the way it handles numeric literals and strings containing numeric representations. The following is copied from my blog posting about it.
"01a4" != "001a4"
We start with something simple and non-controversial. If you have two strings that contain a different number of characters, they can’t be considered equal. The leading zeros are important because these are strings not numbers.
"01e4" == "001e4"
However, PHP doesn’t like strings. It’s looking for any excuse it can find to treat your values as numbers. And here we have it. Change the hexadecimal characters in those strings slightly and suddenly PHP decides that these aren’t strings any more, they are numbers in scientific notation (PHP doesn’t care that you used quotes) and they are equivalent because leading zeros are ignored for numbers. To reinforce this point you will find that PHP also evaluates "01e4" == "10000" as true because these are numbers with equivalent values. This is documented behaviour, it’s just not very sensible.
Enter ===
At this point the PHP apologists chime in with the suggestion to use the === operator. This is an equality operator that compares not only the values of the arguments but their types as well. Both sides must have the same type as well as identical values. This doesn’t seem like it should make any difference as the literals on both sides of the comparison already have identical types, regardless of whether that type is string or integer. Of course that’s not the case and when you use the extra equals sign the values remain as strings rather than being interpreted as integers. "01e4" === "001e4" evaluates to false (correct, but not entirely convincing).
"0x001a4" == 0x01a4
So it seems that the rule in PHP is that if the contents of a string can be parsed as a numeric literal then, for comparisons, they are, as we see with the above hexadecimals (note the difference in notation from the first example, i.e. the use of the 0x prefix). Leading zeros are ignored when numbers are involved.
"0012" != 0012
Unfortunately that’s not the full story as the final example shows. Like many other languages, PHP interprets numbers beginning with a zero as octal values, but not when that number is within a string. This is completely inconsistent with the way it processes hexadecimal values and scientific notation within strings.
I recently spent an hour looking for a problem with some documentation I was creating. The build process was complaining about a broken cross reference, yet the reference was right there in the source file. I thought there were stale files that held the cross-referencing information and deleted the entire build repeatedly. Finally, I found the problem: the source files didn't have the cross references. The editor buffer I was looking at was showing me an outdated version of the file. Once I refreshed the buffer, it was obvious.
ReXX
Using PULL to read binary data from the external data queue.
PULL is a shorthand way of saying PARSE UPPER PULL. Of course, once binary data (binary integers in my case) were converted to upper case, they just wouldn't add up properly any more!
Always read non-character data using PARSE PULL.
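The effect is easy to reproduce elsewhere. This is a Python 3 sketch, not ReXX, and it only mimics what upper-casing does to raw bytes: any byte that happens to match a lowercase ASCII letter is silently changed, so the integer it encodes changes too.

```python
value = 97                      # 0x00000061; the low byte is ASCII 'a'
raw = value.to_bytes(4, "big")  # b'\x00\x00\x00a'

# Upper-casing treats the bytes as text and rewrites 0x61 ('a') to 0x41 ('A').
folded = raw.upper()

print(int.from_bytes(raw, "big"))     # 97
print(int.from_bytes(folded, "big"))  # 65
```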
Not the biggest, but today (embarrassingly enough) I was working on some legacy C++ code. I don't do this often, so my mind sometimes slips going from C# back to C++. Anyway, one object returned (as a property) an internal object. Remember, it's legacy code I'm working with! :) Should have been a reference. So for the want of an ampersand I spent 3 hours tracking down what appeared to be a memory corruption issue.
this:
Foo Bar::GetFoo()
{
return _foo;
}
should have been:
Foo & Bar::GetFoo()
{
return _foo;
}
Good grief, Charlie Brown!
Found it by putting break points on the c-tor and d-tor for Foo. When the d-tor is called multiple times and the c-tor isn't...
I was recently caught out by sequence points in C.
We all know the obvious cases, like that x = x++; leads to undefined behaviour because you are attempting to modify a variable twice without a sequence point in between. But there are also less obvious cases, such as:
someArray[f()] = g();
g() can be evaluated either before or after f(), so if g() modifies state that influences the return value of f(), the result is unspecified: which array element gets assigned to will depend on the phase of the moon!
In my case, f() was actually a macro called SIBLING() that took a struct tree * and returned a pointer to the sibling of that node in the tree, and g() was a function that modified the child pointers in a given tree node. I could not figure out why the assignment seemed to be updating the wrong child. I thought for a long time that the problem must have been caused by SIBLING()'s macroness (since we all know that macros are evil). It wasn't till I rewrote it as a function and got exactly the same behaviour that it finally clicked.
I did this once:
<input type="button" onlick="ShowPopup()" value="Go!" />
You've got no idea how stupid I felt when, after a good few minutes of wondering why my function was not being called, I realized all I needed to do was LICK this button...
You can't use generics in Windows Forms visual inheritance. Seems to me as if this particular feature was "developed" by an intern or something.
JavaScript's lousy scoping for variables in loops.
var functions = [];
for (var i = 0; i < 5; i++)
    functions.push(function() { alert(i); });
functions[1](); // displays 5, instead of the expected 1
There is, of course, a workaround for this but it's 1) not the sort of thing you expect, the first time you do this and 2) the workaround is fugly.