I haven't done enterprise work in Java, but I often see the reverse-domain-name package naming convention. For example, for a Stack Overflow Java package you'd put your code underneath package com.stackoverflow.

I just ran across a Python package that uses the Java-like convention, and I wasn't sure what the arguments for and against it are, or whether they apply to Python in the same way as Java. What are the reasons you'd prefer one over the other? Do those reasons apply across the languages?

+5  A: 

The idea is to keep namespaces conflict-free. Unlike unreadable UUIDs or the like, a reverse domain name is readable and unlikely to get in someone else's way. Very simple, but pragmatic. Moreover, when using 3rd party libs it might give you a clue as to where they came from (for updates, support etc.)

Daniel Schneller
Ah, I had forgotten Java has no mechanism for import remapping ( http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4983159 ), hence the possibility of collision. That may be the reason.
cdleary
Hmm, that would be nice! import my.code.Enquiry; import their.code.Enquiry as TheirEnquiry; wouldn't necessarily even have variable name/keyword clashes in the class body, as it's in the import section.
JeeBee
+9  A: 

It's a great way of preventing name collisions, and takes full advantage of the existing domain name system, so it requires no additional bureaucracy or registration. It is simple and brilliant.

By reversing the domain name it also gives it a hierarchical structure, which is handy. So you can have sub-packages on the end.

The only downside is the length of the name, but to me that is not a downside at all. I think it is a pretty good idea for any language that would support it.

Why don't JavaScript libraries do it, for example? Their global namespace is a big problem, yet JavaScript libraries use simple global identifiers like '$' which clash with other JavaScript libraries.

thomasrutter
JavaScript is a totally separate issue. Python has file-level module scope and an import mechanism, which JavaScript does not.
cdleary
Ah, I wasn't aware Python could do that. The comment about JavaScript is still relevant though. Objects in JavaScript inhabit a global scope and so do package names in Java, so both would benefit from some sane naming convention to prevent collisions.
thomasrutter
Right, that makes sense -- the naming convention is used to avoid collisions in Java package imports. But is it necessary/useful in Python?
cdleary
Sorry I don't know enough about Python to comment - I tried reading this http://www.network-theory.co.uk/docs/pytut/PythonScopesandNameSpaces.html and it seems to indicate that it wouldn't be necessary in Python, at least for modules.
thomasrutter
A: 

Java is able to do it like this, since it is a recommended Java standard practice, and pretty much universally accepted by the Java community. Python does not have this convention.

Avi
I thought python was perfect? :)
willcodejavaforfood
Python *is* perfect; Java doesn't really need this, they just elected to do it. ;-)
S.Lott
Python is perfect. It's just that Java is using a wrong convention ;)
Oli
S.Lott, you beat me by 2 seconds.
Oli
Yep -- my question was asking *why*.
cdleary
willcodejavaforfood
@avi - where's the reasoning? Your answer does not give any benefits or any technical rationale for this approach...
mP
@mP: The question was about conventions. There are definite rationales in favor of both approaches. A good one for Java is that it makes a nice clear way to navigate packages, and see clearly where each one belongs.
Avi
I'm not sure why I keep being voted down on this - does anyone have any reasons?
Avi
+10  A: 

"What are the reasons you'd prefer one over the other?"

Python's style is simpler. Java's style allows same-name products from different organizations.

"Do those reasons apply across the languages?"

Yes. You can easily have top level Python packages named "com", "org", "mil", "net", "edu" and "gov" and put your packages as subpackages in these.

Edit. You have some complexity when you do this, because everyone has to cooperate and not pollute these top-level packages with their own cruft.

Python didn't start doing that because namespace collisions -- as a practical matter -- turn out to be rather rare.

Java started out doing that because the folks who developed Java foresaw lots of people cluelessly choosing the same name for their packages and needing to sort out the collisions and ownership issues.

Java folks didn't foresee the Open Source community picking weird off-the-wall unique names to avoid name collisions. Everyone who writes an xml parser, interestingly, doesn't call it "parser". They seem to call it "Saxon" or "Xalan" or something completely strange.

S.Lott
+1 True, I've never had a conflict between two same-named packages. It's possible that you have two packages named `django`, though, and it would probably be tough to import both simultaneously. It looks like it may be *impossible* in Java since it doesn't have import name remapping.
cdleary
Given Python's filesystem-to-package mapping, using top-level packages like "com" and "org" would in fact *create* namespace collisions.
Jeff Shannon
@Jeff S: That sounds like the start of a great answer!
cdleary
@Jeff S But isn't that why Python gives you tighter control over the namespace when you're *importing* modules? import com as com2, from com import sub as sub1. IMO, this is much better than having to fall back on fully qualified class names.
Joe Holloway
@jholloway7: See my edited answer about `com/__init__.py` for why import aliases don't actually help here.
Jeff Shannon
@Jeff S That makes sense. Thanks
Joe Holloway
+2  A: 

Python does have it, it's just a much flatter hierarchy. Look at os.path, for example. And there's nothing stopping library designers making much deeper ones, e.g. Django.

Fundamentally, I think Python is designed on the idea that you want to get stuff done without having to specify or type too much in advance. This greatly helps with scripting and command-line use. There are several parts of 'The Zen of Python' that address the rationale for this:

  • Simple is better than complex.
  • Flat is better than nested.
  • Beautiful is better than ugly. (The Java system looks ugly to me.)

On the other hand, there's:

  • Namespaces are one honking great idea -- let's do more of those!
Edmund
I think you're answering the *title* of the question rather than the contents -- I'm really wondering about the reverse-domain-name hierarchies.
cdleary
+17  A: 

Python doesn't do this because you end up with a problem -- who owns the "com" package that almost everything else is a subpackage of? Python's method of establishing package hierarchy (through the filesystem hierarchy) does not play well with this convention at all. Java can get away with it because package hierarchy is defined by the structure of the string literals fed to the 'package' statement, so there doesn't need to be an explicit "com" package anywhere.

There's also the question of what to do if you want to publicly release a package but don't own a domain name that's suitable for bodging into the package name, or if you end up changing (or losing) your domain name for some reason. (Do later updates need a different package name? How do you know that com.nifty_consultants.nifty_utility is a newer version of com.joe_blow_software.nifty_utility? Or, conversely, how do you know that it's not a newer version? If you miss your domain renewal and the name gets snatched by a domain camper, and someone else buys the name from them, and they want to publicly release software packages, should they then use the same name that you had already used?)

Domain names and software package names, it seems to me, address two entirely different problems, and have entirely different complicating factors. I personally dislike Java's convention because (IMHO) it violates separation of concerns. Avoiding namespace collisions is nice and all, but I hate the thought of my software's namespace being defined by (and dependent on) the marketing department's interaction with some third-party bureaucracy.

To clarify my point further, in response to JeeBee's comment: In Python, a package is a directory containing an __init__.py file (and presumably one or more module files). A package hierarchy requires that each higher-level package be a full, legitimate package. If two packages (especially from different vendors, but even not-directly-related packages from the same vendor) share a top-level package name, whether that name is 'com' or 'web' or 'utils' or whatever, each one must provide an __init__.py for that top-level package. We must also assume that these packages are likely to be installed in the same place in the directory tree, i.e. site-packages/[pkg]/[subpkg]. The filesystem thus enforces that there is only one [pkg]/__init__.py -- so which one wins? There is not (and cannot be) a general-case correct answer to that question. Nor can we reasonably merge the two files together. Since we can't know what another package might need to do in that __init__.py, subpackages sharing a top-level package cannot be assumed to work when both are installed unless they are specifically written to be compatible with each other (at least in this one file). This would be a distribution nightmare and would pretty much invalidate the entire point of nesting packages. This is not specific to reverse-domain-name package hierarchies, though they provide the most obvious bad example and (IMO) are philosophically questionable -- it's really the practical issue of shared top-level packages, rather than the philosophical questions, that is my main concern here.
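The "which one wins?" problem can be sketched in a few lines. This is only an illustration under assumptions: the temporary directory stands in for site-packages, and the names shared_top, alpha, beta, ALPHA_READY and BETA_READY are made up for the demo. Each simulated install writes its own copy of the shared top-level __init__.py, so the second install silently replaces the first.

```python
import importlib
import os
import sys
import tempfile

# Stand-in for a real site-packages directory (assumption for the demo).
site_packages = tempfile.mkdtemp()


def install(top_init_source, subpkg):
    """Mimic unpacking one distribution into the shared directory."""
    top = os.path.join(site_packages, "shared_top")
    os.makedirs(os.path.join(top, subpkg), exist_ok=True)
    # Each distribution ships its own copy of shared_top/__init__.py.
    with open(os.path.join(top, "__init__.py"), "w") as f:
        f.write(top_init_source)
    # The subpackage itself gets an empty __init__.py.
    with open(os.path.join(top, subpkg, "__init__.py"), "w") as f:
        f.write("")


install("ALPHA_READY = True\n", "alpha")
install("BETA_READY = True\n", "beta")  # overwrites alpha's __init__.py

sys.path.insert(0, site_packages)
importlib.invalidate_caches()
shared_top = importlib.import_module("shared_top")

# Only the setup code from the last-installed distribution survives.
alpha_setup_ran = hasattr(shared_top, "ALPHA_READY")
beta_setup_ran = hasattr(shared_top, "BETA_READY")
print(alpha_setup_ran, beta_setup_ran)
```

Both subpackages are still importable here, but whatever the first vendor's __init__.py was supposed to set up is gone, which is the compatibility problem described above.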

(On the other hand, a single large package using subpackages to better organize itself is a great idea, since those subpackages are specifically designed to work and live together. This is not so common in Python, though, because a single conceptual package doesn't tend to require a large enough number of files to need the extra layer of organization.)

Jeff Shannon
You don't have to use reverse domain names as the package name (it's a hassle in the UK - do we use "uk.co.company", which is tatty?) It's just one way to set up an informal namespace system, and it's worked fine up to now. I just don't see your argument.
JeeBee
+7  A: 

If Guido himself announced that the reverse domain convention ought to be followed, it wouldn't be adopted, unless there were significant changes to the implementation of import in python.

Consider: python searches an import path at run-time with a fail-fast algorithm; java searches a path with an exhaustive algorithm both at compile-time and run-time. Go ahead, try arranging your directories like this:

folder_on_path/
    com/
        __init__.py
        domain1/
            module.py
            __init__.py


other_folder_on_path/
    com/
        __init__.py
        domain2/
            module.py
            __init__.py

Then try:

from com.domain1 import module
from com.domain2 import module

Exactly one of those statements will succeed. Why? Because either folder_on_path or other_folder_on_path comes higher on the search path. When python sees from com. it grabs the first com package it can. If that happens to contain domain1, then the first import will succeed; if not, it throws an ImportError and gives up. Why? Because import must occur at runtime, potentially at any point in the flow of the code (although most often at the beginning). Nobody wants an exhaustive tree-walk at that point to verify that there's no possible match. It assumes that if it finds a package named com, it is the com package.
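For anyone who wants to try it without creating the folders by hand, here is a rough sketch that builds the layout above in a temporary directory and attempts both imports. The helper names make_package and try_import are made up for this demo. It only covers regular packages, i.e. directories that contain an __init__.py as in the layout shown; directories without one are treated differently by Python 3.3 and later.

```python
import importlib
import os
import sys
import tempfile


def make_package(root, *parts):
    """Create nested regular packages (each with __init__.py) under root."""
    path = root
    for part in parts:
        path = os.path.join(path, part)
        os.makedirs(path, exist_ok=True)
        open(os.path.join(path, "__init__.py"), "a").close()
    # Put a module.py in the innermost package.
    with open(os.path.join(path, "module.py"), "w") as f:
        f.write("ORIGIN = %r\n" % parts[-1])


def try_import(name):
    """Return True if the import succeeds, False on ImportError."""
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False


base = tempfile.mkdtemp()
first = os.path.join(base, "folder_on_path")
second = os.path.join(base, "other_folder_on_path")
make_package(first, "com", "domain1")
make_package(second, "com", "domain2")

# folder_on_path comes first on the search path, so its com wins.
sys.path[:0] = [first, second]
importlib.invalidate_caches()

results = {
    "com.domain1.module": try_import("com.domain1.module"),
    "com.domain2.module": try_import("com.domain2.module"),
}
print(results)
```

Swapping the order of first and second in sys.path (in a fresh interpreter) should flip which of the two imports succeeds.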

Moreover, python doesn't distinguish between the following statements:

from com import domain1
from com.domain1 import module
from com.domain1.module import variable

The concept of verifying that com is the com is going to be different in each case. In java, you really only have to deal with the second case, and that can be accomplished by walking through the file system (I guess an advantage of naming classes and files the same). In python, if you tried to accomplish import with nothing but file system assistance, the first case could (almost) be transparently the same (__init__.py wouldn't run), and the second case could be accomplished, though you would lose the initial running of module.py. The third case, however, is entirely unattainable: the code has to execute for variable to be available. And this is another main point: import does more than resolve namespaces, it executes code.
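A small sketch of that last point, with made-up names (demo_pkg, demo_log, variable): each file records that it ran, and the value imported at the end only exists because module.py was actually executed.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
os.makedirs(pkg_dir)

# A tiny helper module that the other files append to when they execute.
with open(os.path.join(root, "demo_log.py"), "w") as f:
    f.write("events = []\n")

with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("import demo_log\n"
            "demo_log.events.append('demo_pkg/__init__.py')\n")

with open(os.path.join(pkg_dir, "module.py"), "w") as f:
    f.write("import demo_log\n"
            "demo_log.events.append('demo_pkg/module.py')\n"
            "variable = sum(range(10))  # computed at import time\n")

sys.path.insert(0, root)
importlib.invalidate_caches()

# The third form from the list above: it runs both files, in order.
from demo_pkg.module import variable
import demo_log

print(demo_log.events)
print(variable)
```

The name variable cannot be found by looking at the file system alone; it is produced by running the module body, and the package's __init__.py runs first.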

Now, you could get away with this if every python package ever distributed required an installation process that searched for the com folder, and then the domain, and so on and so on, but this makes packaging considerably harder, destroys drag-and-drop capability, and makes packaging an all-out nuisance.

David Berger
from com.domain1 import module as d1m # etc.
Ryan Ginstrom
Good point about multiple 'com' packages in different path-directories -- I'd only been considering the problems with sharing a single parent package, but separate packages of the same name is even worse.
Jeff Shannon
Java searches for classes at runtime, not only at compile time, using the directory structure. It's dynamically linked.
Marian
@Marian thanks a lot
David Berger
+5  A: 

Somewhere on Joel on Software, Joel has a comparison between two methods of growing a company: the Ben & Jerry's method, which starts small and grows organically, and the Amazon method of raising a whole lot of money and staking very wide claims from the start.

When Sun introduced Java, it was with fanfare and hype. Java was supposed to take over. Most future relevant software development would be on web-delivered Java applets. There would be brass bands and even ponies. In this context, it was sensible to establish, up front, a naming convention that was internet-based, corporation-friendly, and on a planetary scale.

OK, it didn't turn out quite as Sun hoped, but they planned as if they would succeed. Personally, I despise projects that can be undermined by success.

Python was a project by Guido van Rossum initially, and it was quite some time before the community was confident it would survive if van Rossum was hit by a bus. There were, as far as I know, no initial plans to take over the world, and it was not intended as a web applet language.

Therefore, during the formative stages of the language, there was no reason to want a vast hierarchy for a naming scheme. In that more informal community, one selected a more or less whimsical project name and checked to see if somebody else was already using it. (Naming a computer language after a British comedy show might be considered whimsical just to start.) There was no perceived need to cater to a big but unimaginative and clumsy naming scheme.

David Thornley
+10 for the historical context and a clear perspective. But I can only do a +1 :)
Lakshman Prasad
I'll do my part to support that +10 :)
Jeff Shannon
"Personally, I despise projects that can be undermined by success." I mostly agree (though 'despise' is rather strong), but at the same time, companies/projects that promise salvation and the One True Way give me hives...
Jeff Shannon