Suppose I have 500 jar files linked to my program totaling over 500 MB (size of all the jars, not each one) and my program makes a call to a class located in one of them. How does Java search through jars for a class, and what is the efficiency of this? O(n)? O(log(n))?
By default it used to be linear; however, since JDK 1.3 a JAR index can be embedded in the first JAR file of an application.
This means that if the index is embedded into the JAR file the class loader can efficiently find all the classes distributed over multiple JAR files belonging to the application.
link to SUN Resource on JAR Indexing. Note: non-class resources don't seem to be covered.
Java looks in the internal directory structure of the jar for an exact match on the fully qualified name. It looks; it does not search. If you have 500 jar files on the classpath, Java will look in them one by one in the specified order until it finds a match. If the jar containing a given class is the last one, Java will look in the 500 jar files. So I guess it's O(n).
UPDATE: The behavior described above is the default behavior. However, as Hassan pointed out, this can be optimized by providing an JarIndex in the root jar file allowing the classloader to find the proper jar file with a simple lookup on a package name.
Tom mentions in a comment that jar files have an index at the end. This is worth noting. You mention that the total size of your hypothetical jar files is 500MB, but actually the size is irrelevant. The classloader never has to look through a whole jar file byte by byte. Instead, it seeks to the end of the file, then reads back through the central directory. This is a series of records containing the names of the files in the zip, various metadata, and the offset of that file in the zip file. Here's a brief summary of the PKZIP file format.
So, to search 500 jar files, the classloader would open each one (until a match), and look at the end of it to see if the file is present. It may or may not cache this information, to speed up future searches (things can get really slow if it doesn't, and there are a lot of entries in the path). This whole process is proportional to the total number of files in all the jars, rather than the total size of all the jars. Parsing a JAR index is still proportional to the total number of files in all the jars, but it's a smaller constant since there's less filesystem activity.