In VBScript or JavaScript, for example, creating a variable implicitly on first assignment is allowed, but it is usually not considered a best practice.

Why does Python force you to create the variable only when you use it? Since Python is case sensitive can't it cause bugs because you misspelled a variable's name?

How would you avoid such a situation?

+9  A: 

In Python it helps to think of declaring variables as binding values to names.

Try not to misspell them, or you will have new ones (assuming you are talking about assignment statements - referencing them will cause an exception).

If you are talking about instance variables, you won't be able to use them afterwards.

For example, if you had a class myclass and in its __init__ method wrote self.myvar = 0, then trying to reference self.myvare will cause an error, rather than give you a default value.
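
As a minimal sketch of the distinction above (the class and the misspelled names are made up for illustration): assigning to a misspelled attribute quietly binds a new name, while reading a name that was never bound raises an exception.

```python
class MyClass(object):
    def __init__(self):
        self.myvar = 0


obj = MyClass()

# Assigning to a misspelled attribute silently binds a new name.
obj.myvare = 5
created_silently = hasattr(obj, "myvare")  # obj.myvar is untouched

# Reading a misspelled attribute that was never assigned raises AttributeError.
try:
    obj.myvra
    read_error = None
except AttributeError as exc:
    read_error = type(exc).__name__


def read_misspelled_local():
    total = 1
    return totl  # misspelled name: raises NameError when this line runs


try:
    read_misspelled_local()
    local_error = None
except NameError as exc:
    local_error = type(exc).__name__
```

The misspelled read fails at the exact line where it happens, whereas the misspelled assignment goes unnoticed until something reads the name that was actually intended.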

danben
+1 what you said about misspelling made me laugh
lamas
Could you explain why referencing to self.myvar will cause an error?
the_drow
Referencing `self.myvar` will NOT cause an error, because it is assigned in `__init__`. Referencing `self.myvare` OUTSIDE OF `__init__` (I should have been more explicit about that) will cause an error, because it was not assigned in `__init__`.
danben
`class Foo():` `def __init__(self): self.bar = 2`; `foo = Foo()`; `foo.bard = 3` raises no exception in Python 2.6.2 (tested). `dir(foo)` becomes `['__doc__', '__init__', '__module__', 'bar', 'bard']`.
badp
Ah, you're right - it works the same way as a local var. Ok, I amend my last statement to be "referencing (other than assignment)"
danben
@danben, I think "accessing" might even be a better word than "referencing". I can reference a name that's undefined for assignment. To borrow from C++, an undefined name in Python is, by default, perfectly valid as an lvalue, but will raise either an AttributeError or a NameError, depending on context, when the undefined name is part of an rvalue expression.
Nathan Ernst
+5  A: 

Python never forces you to create a variable only when you use it. You can always bind None to a name and then use the name elsewhere later.
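
A small sketch of that idea (the function and its names are hypothetical): the name is bound to None at the top, which reads much like a declaration with a default value.

```python
def find_first_negative(numbers):
    # Bind the name up front, much like a declaration with a default value.
    result = None
    for n in numbers:
        if n < 0:
            result = n
            break
    return result
```

For instance, `find_first_negative([3, -1, -2])` returns `-1`, and a list with no negative numbers yields `None`.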

Ignacio Vazquez-Abrams
-1 from me as I don't think you're really answering the question. If you edit to show how this could prevent the problems the OP's worried about, I'll +2.
j_random_hacker
+3  A: 

If you do any serious development you'll use an (integrated) development environment. Pylint will be part of it and tell you all your misspellings. No need to make such a feature part of the language.

THC4k
Pylint does help. However the use of Pylint or an IDE for serious development is not a given.
batbrat
+1 for Pylint, +0 for "No need to make such a feature part of the language".
j_random_hacker
+3  A: 

To avoid misspelling variable names, I use a text editor with autocompletion, and I have bound

 python -c "import py_compile; py_compile.compile('{filename}')"

to a function to be called when I save a file.
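
The same check can be driven from Python. This is only a sketch: the helper name and the two sample sources are invented for illustration. It suggests that byte-compiling catches syntax errors, while a misspelled name still compiles and only fails when the line runs.

```python
import os
import py_compile
import tempfile


def compiles_cleanly(source):
    """Write source to a temporary file and byte-compile it; True on success."""
    fd, path = tempfile.mkstemp(suffix=".py")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(source)
        try:
            # doraise=True turns compile problems into PyCompileError.
            py_compile.compile(path, cfile=path + "c", doraise=True)
        except py_compile.PyCompileError:
            return False
        return True
    finally:
        # Clean up the source file and the compiled file, if any.
        for leftover in (path, path + "c"):
            if os.path.exists(leftover):
                os.remove(leftover)


syntax_error = "def f(:\n    pass\n"
misspelled_name = "count = 0\nprint(coutn)\n"

syntax_ok = compiles_cleanly(syntax_error)    # the syntax error is rejected
typo_ok = compiles_cleanly(misspelled_name)   # the typo is not noticed here
```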

Vestel
+2  A: 

Test.

Example, with file variables.py:

#! /usr/bin/python

somevar = 5

Then, make file variables.test (to hold the tests):

>>> import variables
>>> variables.somevar == 4
True

Then do:

python -m doctest variables.test

And get:

**********************************************************************
File "variables.test", line 2, in variables.test
Failed example:
    variables.somevar == 4
Expected:
    True
Got:
    False
**********************************************************************
1 items had failures:
   1 of   2 in variables.test
***Test Failed*** 1 failures.

This shows a variable that does not hold the expected value.

Try:

>>> import variables
>>> variables.someothervar == 5
True

Note that the variable is not named the same.

**********************************************************************
File "variables.test", line 2, in variables.test
Failed example:
    variables.someothervar == 5
Exception raised:
    Traceback (most recent call last):
      File "/usr/local/lib/python2.6/doctest.py", line 1241, in __run
        compileflags, 1) in test.globs
      File "<doctest variables.test[1]>", line 1, in <module>
        variables.someothervar == 5
    AttributeError: 'module' object has no attribute 'someothervar'
**********************************************************************
1 items had failures:
   1 of   2 in variables.test
***Test Failed*** 1 failures.

This shows a misspelled variable.

>>> import variables
>>> variables.somevar == 5
True

And this returns with no error.

I've done enough VBScript development to know that typos are a problem in variable names, and enough VBScript development to know that Option Explicit is a crutch at best. (<- 12 years of ASP VBScript experience taught me that the hard way.)

Christopher Mahan
+6  A: 

It's a silly artifact of Python's inspiration by "teaching languages", and it serves to make the language more accessible by removing the stumbling block of "declaration" entirely. For whatever reason (probably represented as "simplicity"), Python never gained an optional stricture like VB's "Option Explicit" to introduce mandatory declarations. Yes, it can be a source of bugs, but as the other answers here demonstrate, good coders can develop habits that allow them to compensate for pretty much any shortcoming in the language -- and as shortcomings go, this is a pretty minor one.

hobbs
I hope this answer wins. Clearly the answer is "You can't do that in Python, but it's less of a big deal than you probably think."
j_random_hacker
+10  A: 

If you want a class with "locked-down" instance attributes, it's not hard to make one, e.g.:

class LockedDown(object):
    # Class-level flag; name mangling stores it as _LockedDown__locked.
    __locked = False

    def __setattr__(self, name, value):
        # Once locked, reject any attribute that does not already exist.
        if self.__locked:
            if name[:2] != '__' and name not in self.__dict__:
                raise ValueError("Can't set attribute %r" % name)
        object.__setattr__(self, name, value)

    def _dolock(self):
        self.__locked = True


class Example(LockedDown):
    def __init__(self):
        self.mistakes = 0
        self._dolock()  # no new attributes after this point

    def onemore(self):
        self.mistakes += 1
        print(self.mistakes)

    def reset(self):
        self.mitsakes = 0  # deliberate misspelling


x = Example()
for i in range(3):
    x.onemore()
x.reset()  # raises ValueError: Can't set attribute 'mitsakes'

As you'll see, the calls to x.onemore work just fine, but reset raises an exception because of the mis-spelling of the attribute as mitsakes. The rules of engagement here are that __init__ must set all attributes to initial values, then call self._dolock() to forbid any further addition of attributes. I'm exempting "super-private" attributes (ones starting with __), which stylistically should be used very rarely, for totally specific roles, and with extremely limited scope (making it trivial to spot typos in the super-careful inspection that's needed anyway to confirm the need for super-privacy), but that's a stylistic choice, easy to reverse; similarly for the choice to make the locked-down state "irreversible" (by "normal" means -- i.e. requiring very explicit workaround to bypass).

This doesn't apply to other kinds of names, such as function-local ones; again, no big deal because each function should be very small, and is a totally self-contained scope, trivially easy to inspect (if you write 100-line functions, you have other, serious problems;-).

Is this worth the bother? No, because semi-decent unit tests should obviously catch all such typos with the greatest of ease, as a natural side effect of thoroughly exercising the class's functionality. In other words, it's not as if you need to have more unit tests just to catch the typos: the unit tests you need anyway to catch trivial semantic errors (off-by-one, +1 where -1 is meant, etc., etc.) will already catch all typos, too.

Robert Martin and Bruce Eckel both articulated this point 7 years ago in separate and independent articles -- Eckel's blog is temporarily down right now, but Martin's right here, and when Eckel's site revives the article should be here. The thesis is controversial (Jeff Atwood and his commenters debate it here, for example), but it's interesting to note that Martin and Eckel are both well-known experts of static languages such as C++ and Java (albeit with love affairs, respectively, with Ruby and Python), and they're far from the only ones to have discovered the importance of unit tests... and how a good unit-test suite, as a side effect, makes a static language's rigidity redundant.

By the way, one way to check your test suites is "error injection": systematically go over your codebase introducing one mis-spelling at a time -- run the tests to make sure they do fail; if they don't, add one that does fail; correct the spelling mistake; repeat. This can be fairly well automated (not the "add a test" part, but the finding of potential errors that aren't covered by the suite), as can some other forms of error injection (change every integer constant, one by one, to one more, and to one less; change each < to <= etc; swap each if and while condition to its reverse; ...), while yet other forms of error injection require a lot more human savvy. Unfortunately I don't know of publicly available suites of error injection frameworks (for any language) -- might make a cool open source project;-).
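
A toy sketch of the automated part, under heavy assumptions: the function under test, the deliberately thin "suite", and the list of text substitutions are all invented here, and a real tool would mutate the syntax tree rather than do string replacement. Each injected error produces a mutant, and any mutant that still passes points at a missing test.

```python
SOURCE = '''
def count_below(values, limit):
    total = 0
    for value in values:
        if value < limit:
            total += 1
    return total
'''


def load(source):
    """Execute the source text and return the function it defines."""
    namespace = {}
    exec(source, namespace)
    return namespace["count_below"]


def suite_passes(func):
    """A tiny stand-in for a test suite; True when every check passes."""
    try:
        return func([1, 2], 3) == 2 and func([], 5) == 0
    except Exception:
        return False


def surviving_mutants(source, mutations):
    """Return the injected errors that the suite fails to detect."""
    survivors = []
    for old, new in mutations:
        if old not in source:
            continue
        mutant = load(source.replace(old, new, 1))
        if suite_passes(mutant):
            survivors.append((old, new))
    return survivors


MUTATIONS = [
    ("value < limit", "value <= limit"),  # boundary change
    ("total += 1", "totl += 1"),          # injected misspelling
    ("total = 0", "total = 1"),           # constant off by one
]

original_ok = suite_passes(load(SOURCE))
survivors = surviving_mutants(SOURCE, MUTATIONS)
```

In this sketch the misspelling and the changed constant are both caught, but the `<=` mutant survives because no check uses a value equal to the limit. That is the cue to add such a test.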

Alex Martelli
another gem of an answer.
telliott99
@telliott, thanks!-)
Alex Martelli
Error injection is a great idea. But I disagree that the obvious usefulness of unit tests makes strong typing useless. Why not ask the compiler to do some basic checks for you, so you can find a class of bugs (sure, not *all* bugs) earlier and with less effort? Popular strongly-typed languages are verbose because they aren't great at inferring types automatically -- but languages like OCaml and Haskell show that it doesn't have to be that way.
j_random_hacker
@j_random, have you read Martin's and Eckel's essays? Can't summarize them within one comment, but the point is: you'd have found those bugs extremely early anyway (with just the same set of tests you need **anyway**), so the advantage of the rigid languages is minimal, and the effort to comply with their restriction is **not** less than just not having to worry about them (even in good functional languages such as the two you mention, which have totally negligible market shares in the real world, the effort is **still** > 0).
Alex Martelli
Just read Martin's essay. Yes, enough tests will catch these bugs. But for ultra-simple code (e.g. getters/setters) I no longer write tests, as it halves my productivity -- it literally doubles the # of lines needing to be written. If that seems "slack", recognise that you can't test everything -- you *must* focus on areas likely to contain bugs. (E.g. most people don't bother testing that `x.setFoo(y)` leaves `x.bar` unchanged.) I'd rather let the compiler do the drudge work for me where it can by finding typos. (And yes I consider writing a declaration less effort than writing a test).
j_random_hacker
@j_random, I don't write getters and setters unless they have important logic -- and then of course I do need to test that logic. In Python (and Ruby, etc) there's no need to write getters, setters and other boilerplate: the code you write is code that DOES things, and therefore it needs testing of its logic. As for "likely to contain bugs", no compiler is going to test that `setFoo`'s nontrivial logic is missing assignments to `bar`, any more than for any other nontrivial method. So just use a language that doesn't MAKE you write anything trivial, and **test** what you do write!
Alex Martelli
Totally agree with avoiding boilerplate wherever possible, my setFoo() example was just an attempt to pre-empt a kneejerk response of "You should test *everything*" -- an argument which happily you didn't make. Another thing I do is embed "compile-time asserts" in C++ library code that I develop, so that if I accidentally misuse this code in the future my mistake will be caught automatically at the earliest possible stage -- compile time. Is it fair to say that by your reasoning, this practice is also a waste of time?
j_random_hacker
And, come to think of it, there is no reason why a language (compiled or not) should bother to report syntax errors, since all errors will be caught by unit tests -- correct?
j_random_hacker
+1, but if I may: I'd suggest extracting this logic to a metaclass instead of a base. I like the solution, but think it makes more sense as a metaclass instead of a base - you're changing the way built-in type infrastructure works, not really offering a functional change, but a meta-change. ;)
Nathan Ernst
@Nathan, custom metaclasses should be used only if they give added value: here, they would not (could do little more than injecting those two methods, just like inheritance does). Since metaclasses are most often and Pythonically obtained by subclassing a class (that does nothing but set its `__metaclass__`), why do you think there would be any advantage whatsoever to implementing that class with a level of indirection via a metaclass, rather than simply and directly as I do? So, total disagreement here. `__setattr__` is a fully normal part of Python, nothing at all meta about it.
Alex Martelli
A: 

Variable declaration does not prevent bugs. Any more than lack of variable declaration causes bugs.

Variable declarations prevent one specific type of bug, but they create other types of bugs.

Prevent. Writing code where there's an attempt to set (or change) a variable with the wrong type of data.

Causes. Stupid workarounds to coerce a number of unrelated types together so that assignments will "just work". Example: The C language union. Also, variable declarations force us to use casts. Which also forces us to suppress warnings on casts at compile time because we "know" it will "just work". And it doesn't.

Lack of variable declarations does not cause bugs. The most common "threat scenario" is some kind of "mis-assignment" to a variable.

  1. Was the variable being "reused"? This is dumb but legal and works.

  2. Was some part of the program incorrectly assigning the wrong type?

    That leads to a subtle question of "what does wrong mean?" In a duck-typed language, wrong means "Doesn't offer the right methods or attributes." Which is still nebulous. Specifically, it means "the type will be asked to provide a method or attribute it doesn't have." Which will raise an exception and the program will stop.

Raising an uncaught exception in production use is annoying and shows a lack of quality. It's stupid, but it's also a detected, known failure mode with a traceback to the exact root cause.
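
A minimal sketch of that failure mode (the classes and the function are hypothetical): an object of the "wrong" type is accepted without complaint until it is asked for a method it doesn't have, and the resulting exception names the missing attribute.

```python
class Duck(object):
    def quack(self):
        return "Quack"


class Robot(object):
    def beep(self):
        return "Beep"


def make_noise(thing):
    # No declared parameter type: any object with a quack() method will do.
    return thing.quack()


duck_noise = make_noise(Duck())

try:
    make_noise(Robot())
    failure = None
except AttributeError as exc:
    failure = str(exc)  # the message names the missing attribute
```

Left uncaught, the same AttributeError would stop the program with a traceback pointing at the `thing.quack()` line.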

"can't it cause bugs because you misspelled a variable's name"

Yes. It can.

But consider this Java code.

public static void maine( String[] argv ) {
    int main;
    int mian;
}

A misspelling here is equally fatal. Statically typed Java has done nothing to prevent a misspelled variable name from causing a bug.

S.Lott
(1) No-one's saying that declarations prevent *all* mistakes due to misspellings. (2) C/C++'s half-assed type system is an (efficiency-motivated) hack, sure, but don't paint all strongly typed languages with that brush. E.g. Java breaks your "Causes" argument. (3) Don't you think it's better to discover a type error sooner (at compile time) rather than later (at runtime)? (4) I like duck-typed languages too, but that doesn't mean strong typing has no value.
j_random_hacker
"can't it cause bugs because you misspelled a variable's name" is faulty. I'm not saying the alternatives are perfect. I'm saying that the assumption at the top of this question is wrong.
S.Lott