For instance,

import org.apache.nutch.plugin.Extension;

Although I have used it many times, I don't really understand what it does.

EDIT: Is org.apache.nutch.plugin essentially four nested directories, or could it be fewer than four, for example with a single directory named org.apache?

+7  A: 

All it's doing is saving you typing. Instead of having to type "org.apache.nutch.plugin.Extension" every time you want to use it, the import allows you to refer to it by its short name, "Extension".
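A minimal sketch of that point, using java.util.ArrayList from the standard library instead of the Nutch class so it runs without extra JARs (the class name ImportShorthand is made up for illustration). Both methods end up with exactly the same class:

```java
import java.util.ArrayList;

public class ImportShorthand {
    // Uses the short name that the import above makes available.
    static Class<?> viaShortName() {
        return new ArrayList<String>().getClass();
    }

    // Uses the fully qualified name; this line would compile even without the import.
    static Class<?> viaFullName() {
        return new java.util.ArrayList<String>().getClass();
    }

    public static void main(String[] args) {
        System.out.println(viaShortName() == viaFullName()); // prints "true"
        System.out.println(viaShortName().getName());        // prints "java.util.ArrayList"
    }
}
```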

Don't be confused by the word "import" - it's not loading the .class file or anything like that. The class loader searches the classpath for the class and loads it the first time your code requires it. (On older HotSpot JVMs the class metadata went into perm space; since Java 8 it goes into Metaspace.)

UPDATE: As a developer you have to know that packages are associated with directories. If you create a package "com.foo.bar.baz" in your .java file, it'll have to be stored in a directory com/foo/bar/baz.

But when you download a JAR file, like that Apache Nutch library, there are no directories involved from your point of view. The person who created the JAR had to zip up the proper directory structure, which you can see as the path to the .class file if you open the JAR using WinZip. You just have to put that JAR in the CLASSPATH for your app when you compile and run.
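If you don't have WinZip handy, you can look at the entry names from Java itself. The sketch below is only an illustration: it builds a throwaway JAR with one placeholder entry (the name com/foo/bar/baz/Example.class and the class JarLayoutSketch are made up, and the entry is not a real class file) and then lists what is inside.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

public class JarLayoutSketch {
    // Writes a temporary JAR with one placeholder entry and returns the entry names found in it.
    static List<String> entryNames() {
        try {
            Path jar = Files.createTempFile("layout-sketch", ".jar");
            try {
                try (OutputStream out = Files.newOutputStream(jar);
                     JarOutputStream jos = new JarOutputStream(out)) {
                    // The package directories are simply part of the entry name.
                    jos.putNextEntry(new JarEntry("com/foo/bar/baz/Example.class"));
                    jos.write(new byte[] {0}); // placeholder bytes, not a real class file
                    jos.closeEntry();
                }
                List<String> names = new ArrayList<>();
                try (JarFile jf = new JarFile(jar.toFile())) {
                    jf.stream().forEach(e -> names.add(e.getName()));
                }
                return names;
            } finally {
                Files.deleteIfExists(jar);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(entryNames());
    }
}
```

For a real library you would skip the writing part and just open the downloaded JAR with JarFile.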

duffymo
I mean, is "org.apache.nutch.plugin" essentially 4 directories?
Shore
+1 People often forget "import" is just shorthand.
William Brendel
@Shore - The designers of Java did identify packages with directory hierarchies as a convention. Another way to think of them is simply as namespaces. You and I can both develop classes named "Foo", as long as we can distinguish them using packages/namespaces. I'm not certain, but I believe that's the way C++ and C# use them.
duffymo
+2  A: 

Imports are just hints to the compiler telling it how to work out the fully qualified names of classes.

So if you have "import java.util.*;" and your code does something like "new ArrayList()", the compiler first needs to find the fully qualified name (FQN) of the type ArrayList. It does so by going through the list of imports and appending ArrayList to each one. Specifically, when it appends ArrayList to java.util it gets the FQN java.util.ArrayList. It then looks up this FQN on its class path. If it finds a class with that name, it knows that java.util.ArrayList is the correct name.
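Here is a rough simulation of that lookup. It is not how javac is actually implemented: it just tries each imported package in turn with Class.forName at run time, and the names ImportResolutionSketch and resolve are invented for this sketch.

```java
import java.util.Arrays;
import java.util.List;

public class ImportResolutionSketch {
    // Tries "package.SimpleName" for each on-demand import and returns the first name that exists.
    static String resolve(String simpleName, List<String> importedPackages) {
        for (String pkg : importedPackages) {
            String fqn = pkg + "." + simpleName;
            try {
                Class.forName(fqn);
                return fqn; // a class with this fully qualified name exists
            } catch (ClassNotFoundException e) {
                // Not in this package; try the next import.
            }
        }
        return null; // no import matched
    }

    public static void main(String[] args) {
        // Simulates "import java.io.*; import java.util.*;"
        List<String> imports = Arrays.asList("java.io", "java.util");
        System.out.println(resolve("ArrayList", imports)); // prints "java.util.ArrayList"
    }
}
```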

Itay
I know about that, though :(
Shore
A: 

Basically when you make a class you can declare it to be part of a package. I personally don't have much experience with doing packages. However, afaik, that basically means that you are importing the Extension class from the org.apache.nutch.plugin package.

Thomas
So what's the relation between package and directory?
Shore
http://en.wikipedia.org/wiki/Java_packages
Thomas
A: 

Building off of Thomas' answer, org.apache.nutch.plugin is a path to the class file(s) you want to import. I'm not sure about this particular package, but generally you'll have a .jar file that you add to your classpath, and your import statement points to the directory "./[classpath]/[jarfile]/org/apache/nutch/plugin"

Norm MacLennan
Is something like this possible: ./[classpath]/[jarfile]/org.apache/nutch/plugin?
Shore
+1  A: 

is "org.apache.nutch.plugin" essentially 4 directories?

If you have a class whose name is org.apache.nutch.plugin.Extension, then it is stored somewhere in the classpath as a file org/apache/nutch/plugin/Extension.class. So the root directory contains four nested subdirectories ("org", "apache", "nutch", "plugin") which in turn contain the class file.
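The mapping is mechanical enough to write down. A small sketch for top-level classes only (nested classes get a $ in their file names, which this ignores); PackagePath and classFilePath are names made up for the example:

```java
public class PackagePath {
    // Converts a fully qualified top-level class name to the relative path of its .class file.
    static String classFilePath(String fullyQualifiedName) {
        return fullyQualifiedName.replace('.', '/') + ".class";
    }

    public static void main(String[] args) {
        System.out.println(classFilePath("org.apache.nutch.plugin.Extension"));
        // prints "org/apache/nutch/plugin/Extension.class"
    }
}
```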

Esko Luontola
Is it possible that org.apache is a single directory instead of two?
Shore
No. Each word in the package name is its own directory.
Esko Luontola
+1  A: 

import org.apache.nutch.plugin.Extension is a compile-time shortcut that allows you to refer to the Extension class without using the class's fully qualified name. It has no meaning at runtime; it's only a compile-time trick to save typing.

By convention the .class file for this class will be located in folder org/apache/nutch/plugin either in the file system or in a jar file, either of which need to be in your classpath, both at compile time and runtime. If the .class file is in a jar file then that jar file needs to be in your classpath. If the .class file is in a folder, then the folder that is the parent of folder "org" needs to be in your classpath. For example, if the class was located in folder c:\myproject\bin\org\apache\nutch\plugin then folder c:\myproject\bin would need to be part of the classpath.

If you're interested in finding out where the class was loaded from when you run your program, use the -verbose:class java command line option. It should tell you in which folder or jar file the JVM found the class.
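You can also ask from inside the program. A minimal sketch, assuming no security manager gets in the way (WhereLoadedFrom and locationOf are made-up names): for classes from a folder or jar on the classpath the code source gives the location, while classes loaded by the JVM's bootstrap loader, such as java.lang.String, have no code source.

```java
import java.security.CodeSource;

public class WhereLoadedFrom {
    // Returns the folder or jar a class was loaded from, or a placeholder for bootstrap classes.
    static String locationOf(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        return source == null ? "(bootstrap class, no code source)" : String.valueOf(source.getLocation());
    }

    public static void main(String[] args) {
        System.out.println(locationOf(String.class));          // a core class from the JVM itself
        System.out.println(locationOf(WhereLoadedFrom.class)); // this class, from your classpath
    }
}
```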

Francois Gravel
Thank you for your reply, but now my focus is this: is it possible that org.apache is a single directory instead of two? It seems no one has a definite answer.
Shore
you cannot use package directories or file names with . in them.
Peter Lawrey
A: 

You can't have a directory named org.apache as a package. The compiler won't understand that name and will look for the directory structure org/apache when you import any class from that package.

Also, do not confuse the Java import statement with the C #include preprocessor directive. The import statement is, like they've said, a shorthand that lets you type fewer characters when referring to a class name.

cd1
Any reference to prove "can't have a directory named org.apache as a package. the compiler won't understand that name"?
Shore
There's nothing like trying it yourself to prove it. Spoiler: CD1 is correct.
William Brendel
as William said, I just tried it here, the compiler didn't recognize my import statement.
cd1
A: 

I think the question you might be trying to ask is, "What are packages in Java, and how does the import keyword relate to them?". Your confusion about directory structures might stem from the fact that some other languages have include directives that use file names to literally include the contents of the specified file in your source code at compile time. C/C++ are examples of languages that use this type of include directive. Java's import keyword does not work this way. As others have said, the import keyword is simply a shorthand way to reference one or more classes in a package. The real work is done by the Java Virtual Machine's class loader (details below).

Let's start with the definition of a "Java package", as described in the Wikipedia article:

A Java package is a mechanism for organizing Java classes into namespaces similar to the modules of Modula. Java packages can be stored in compressed files called JAR files, allowing classes to download faster as a group rather than one at a time. Programmers also typically use packages to organize classes belonging to the same category or providing similar functionality.

In Java, source code files for classes are in fact organized by directories, but the method by which the Java Virtual Machine (JVM) locates the classes is different from languages like C/C++.

Suppose in your source code you have a package named "com.foo.bar", and within that package you have a class named "MyClass". At compile time, the location of that class's source code in the file system must be {source}/com/foo/bar/MyClass.java, where {source} is the root of the source tree you are compiling.

One difference between Java and languages like C/C++ is the concept of a class loader. In fact, the concept of a class loader is a key part of the Java Virtual Machine's architecture. The job of the class loader is to locate and load any class files your program needs. The "primordial" (bootstrap) class loader is built into the JVM itself. The class loaders your application normally deals with are regular objects of type ClassLoader, which has a method called loadClass() with the following definition:

// Loads the class with the specified name.
// Example: loadClass("org.apache.nutch.plugin.Extension")
public Class<?> loadClass(String name) throws ClassNotFoundException

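This loadClass() method attempts to locate the class file for the class with the given name and produces a Class object, which can then be used to instantiate the class. Historically that was done with its newInstance() method; since Java 9 that method is deprecated in favor of getDeclaredConstructor().newInstance().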
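To make that concrete, here is a small sketch that loads a class by name through the system class loader and creates an instance reflectively. It uses java.util.ArrayList so it runs without the Nutch JAR; LoadClassSketch and loadAndInstantiate are names invented for the example, and it assumes the class has a public no-argument constructor.

```java
public class LoadClassSketch {
    // Loads the named class via the system class loader and calls its no-argument constructor.
    static Object loadAndInstantiate(String className) {
        try {
            ClassLoader loader = ClassLoader.getSystemClassLoader();
            Class<?> cls = loader.loadClass(className);
            return cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            // Covers a missing class, a missing constructor, and failures during construction.
            throw new IllegalStateException("Could not load or instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        Object list = loadAndInstantiate("java.util.ArrayList");
        System.out.println(list.getClass().getName()); // prints "java.util.ArrayList"
    }
}
```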

Where does the class loader search for the class file? In the JVM's class path. The class path is simply a list of locations where class files can be found. These locations can be directories containing class files. The class path can also contain jar files, which can themselves contain even more class files. The default class loader is capable of looking inside these jar files to search for class files. As a side note, you could implement your own class loader to, for example, allow network locations (or any other location) to be searched for class files.

So, now we know that whether "com.foo.bar.MyClass" is a class file compiled from your own source tree or a class file inside a jar file somewhere in your class path, the class loader will find it for you, if it exists. If it does not exist, you will get a ClassNotFoundException when loading it by name (or a NoClassDefFoundError when a class your code was compiled against is missing at run time).
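A quick way to see the failure case is to ask the class loader for a name that is not on the class path. This sketch assumes there really is no com.foo.bar.MyClass on your class path; MissingClassSketch and isLoadable are made-up names.

```java
public class MissingClassSketch {
    // Returns true if the system class loader can find the named class, false otherwise.
    static boolean isLoadable(String className) {
        try {
            ClassLoader.getSystemClassLoader().loadClass(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false; // nothing with that name on the class path
        }
    }

    public static void main(String[] args) {
        System.out.println(isLoadable("java.lang.String"));    // prints "true"
        System.out.println(isLoadable("com.foo.bar.MyClass")); // "false" unless you have such a class
    }
}
```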

And now to address the import keyword: I will reference the following example:

import com.foo.bar.MyClass;

...

public void someFunction() {
    MyClass obj1 = new MyClass();
    org.blah.MyClass obj2 = new org.blah.MyClass("some string argument");
}

The first line is simply a way to tell the compiler "Whenever you see a variable declared simply as type MyClass, assume I mean com.foo.bar.MyClass". That is what's happening in the case of obj1. In the case of obj2, you are explicitly telling the compiler "I don't want the class com.foo.bar.MyClass, I actually want org.blah.MyClass". So the import keyword is just a simple way of cutting down on the amount of typing programmers have to do in order to use other classes. All of the interesting stuff is done in the JVM's class loader.
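Since com.foo.bar.MyClass and org.blah.MyClass are placeholders, here is a runnable variant of the same idea using two real JDK classes that share the simple name Proxy (the wrapper class SameSimpleName is made up for the sketch). One is imported, the other is written out in full:

```java
import java.net.Proxy;

public class SameSimpleName {
    // "Proxy" on its own means java.net.Proxy because of the import above.
    static String importedName() {
        return Proxy.class.getName();
    }

    // The other Proxy has to be spelled out with its package.
    static String qualifiedName() {
        return java.lang.reflect.Proxy.class.getName();
    }

    public static void main(String[] args) {
        System.out.println(importedName());  // prints "java.net.Proxy"
        System.out.println(qualifiedName()); // prints "java.lang.reflect.Proxy"
    }
}
```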

For more information about exactly what the class loader does, I recommend reading an article called "The Basics of Java Class Loaders".

William Brendel