Hi,

Let's say I have a directory which is being hosted by Jetty or Apache (I'd like an answer for both). I know the URL, including the port, and I can log into the server.

How can I find the directory that is being hosted on a certain port?

I'd also like to go the other way: I have a folder on the server which I know is being hosted, but I don't know the port, so I can't find it in a web browser.

How can I find a list of directories that are being hosted?

This has been bugging me for ages, but I've never bothered to ask before!

Thanks.

+2  A: 

Here is how to find it out for Apache. Let's say you have a URL http://myserver.de:8081/somepath/index.html

Step 1: Find the process that has the given port open

You can do this by running lsof in a shell on the server. It lists open files (and ports) as well as the processes associated with them:

myserver:~ # lsof -i -P | grep LISTEN | grep :80
apache2   17479        root    4u  IPv6 6271473       TCP *:80 (LISTEN)

We now know there is a process called "apache2" with process ID 17479.
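
One caveat with the grep above: the pattern :80 also matches ports such as 8080 or 8081. Below is a minimal sketch of a stricter filter, assuming lsof output in the layout shown above. The function name listeners_on_port is made up for illustration.

```shell
# Hypothetical helper: read lsof-style lines on stdin and print
# "COMMAND PID" for processes listening on exactly the given port.
listeners_on_port() {
  awk -v port="$1" '
    $NF == "(LISTEN)" {
      addr = $(NF - 1)          # e.g. "*:80" or "[::1]:8081"
      sub(/.*:/, "", addr)      # keep only the text after the last colon
      if (addr == port) print $1, $2
    }'
}
```

It could be used as: lsof -i -P -n | listeners_on_port 8081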

Step 2: Find out more about the process

We can now look at the environment of the process, where more information should be available:

myserver:~ # (cat /proc/17479/environ; echo) | tr "\000" "\n"
PATH=/usr/local/bin:/usr/bin:/bin
PWD=/
LANG=C
SHLVL=1
_=/usr/sbin/apache2

OK, the process executable is /usr/sbin/apache2. Now let's look at the command line.

myserver:~ # (cat /proc/17479/cmdline; echo) | tr "\000" " "
/usr/sbin/apache2 -k start

Step 3: Finding the config of the process

Our previous examination has shown that no special configuration file was given on the command line with the -f option, so we have to find the default location for that process. The default is compiled into the apache2 executable; running the binary with -V prints its compile-time settings, including HTTPD_ROOT and SERVER_CONFIG_FILE. On my machine (Debian Etch) it's the default location for Apache 2, namely /etc/apache2/apache2.conf.
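
The check for a -f option can also be scripted. This is only a sketch: the helper name config_from_cmdline is hypothetical, and it assumes the option and its value are separate arguments (as in "-f /path/to.conf").

```shell
# Hypothetical helper: given a NUL-separated cmdline file (such as
# /proc/PID/cmdline), print the argument that follows -f, if there is one.
config_from_cmdline() {
  tr '\0' '\n' < "$1" | awk 'prev == "-f" { print; exit } { prev = $0 }'
}
```

For the process above this would be called as: config_from_cmdline /proc/17479/cmdline
If it prints nothing, no config file was passed and the compiled-in default applies.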

Step 4: Examining the Apache config file

This again needs some knowledge about Apache configuration. The config file can include other files, so we need to find those first:

myserver:~# cat /etc/apache2/apache2.conf | grep -i ^Include
Include /etc/apache2/mods-enabled/*.load
Include /etc/apache2/mods-enabled/*.conf
Include /etc/apache2/httpd.conf
Include /etc/apache2/ports.conf
Include /etc/apache2/conf.d/
Include /etc/apache2/sites-enabled/
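
The grep above only catches Include lines that start in the first column. A slightly more tolerant sketch, which also picks up indented lines and IncludeOptional, could look like this (list_includes is a made-up name; relative targets are resolved by Apache against its ServerRoot, which this helper does not do):

```shell
# Hypothetical helper: print the target of every Include or
# IncludeOptional directive in one Apache config file.
# Commented-out lines are skipped because their first field starts with "#".
list_includes() {
  awk 'tolower($1) == "include" || tolower($1) == "includeoptional" { print $2 }' "$1"
}
```

Example call: list_includes /etc/apache2/apache2.conf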

A nice list. These configs tell you everything about your configuration, and there are many options that might map files to URLs. In particular, Apache can serve different directories for different domains, even if those domains are all mapped to the same IP. So let's say you host a whole bunch of domains on your server; then "myserver.de" is either mapped by the default configuration or by a configuration that serves this domain specifically.

The most important directives are DocumentRoot, Alias and Redirect. On my system the following gives a quick overview (comments omitted):

myserver:~# ( cat /etc/apache2/apache2.conf; cat /etc/apache2/sites-enabled/* ) | grep 'DocumentRoot\|Alias\|Redirect'
    Alias /icons/ "/usr/share/apache2/icons/"
        DocumentRoot /var/www/
                RedirectMatch ^/$ /apache2-default/
        ScriptAlias /cgi-bin/ /usr/lib/cgi-bin/
    Alias /doc/ "/usr/share/doc/"
        DocumentRoot /var/www/
                RedirectMatch ^/$ /apache2-default/
        ScriptAlias /cgi-bin/ /usr/lib/cgi-bin/
    Alias /doc/ "/usr/share/doc/"
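
Because the same directive can appear in several virtual hosts, it helps to see which file and line each one comes from. A rough sketch, assuming a grep that supports -H (GNU and BSD grep do); the function name mapping_directives is hypothetical:

```shell
# Hypothetical helper: list virtual host markers and path-mapping
# directives with file name and line number, skipping commented-out lines.
mapping_directives() {
  grep -nHE '^[[:space:]]*(<VirtualHost|ServerName|ServerAlias|DocumentRoot|(Script)?Alias(Match)?|Redirect(Match)?)[[:space:]]' "$@"
}
```

Example call: mapping_directives /etc/apache2/apache2.conf /etc/apache2/sites-enabled/*
If it is available on your system, apache2ctl -S (apachectl -S on other distributions) also prints a summary of the virtual hosts Apache has parsed.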

Since the "somepath" part of the URL has no direct match, I can safely assume it lies below the DocumentRoot /var/www/, so the result of my search is that

http://myserver.de:8081/somepath/index.html --> /var/www/somepath/index.html
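
The manual lookup can be written down as a small function. This is a sketch only: the mapping table is hard-coded from the grep output of my system shown earlier, the name url_path_to_file is made up, and virtual hosts, redirects and rewrite rules are ignored.

```shell
# Mapping table copied by hand from the Alias/ScriptAlias/DocumentRoot
# lines shown earlier; adjust it to whatever your own config contains.
DOCROOT=/var/www/

# Hypothetical helper: translate a URL path into a filesystem path.
url_path_to_file() {
  case "$1" in
    /icons/*)   printf '%s\n' "/usr/share/apache2/icons/${1#/icons/}" ;;
    /doc/*)     printf '%s\n' "/usr/share/doc/${1#/doc/}" ;;
    /cgi-bin/*) printf '%s\n' "/usr/lib/cgi-bin/${1#/cgi-bin/}" ;;
    *)          printf '%s\n' "${DOCROOT%/}$1" ;;   # no alias matched
  esac
}
```

Example call: url_path_to_file /somepath/index.html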

You can do a lookup in a similar way for Jetty.
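
For Jetty, steps 1 and 2 work the same way (the process will typically be java). The counterpart of DocumentRoot usually lives in XML context descriptors. As a rough sketch only: the attribute names contextPath, resourceBase and war below are an assumption based on classic Jetty XML descriptors, newer Jetty versions may use other names, and where the files live depends on the installation.

```shell
# Hypothetical helper: show context path and served-location settings
# found in Jetty XML descriptors (the attribute names are an assumption).
jetty_context_settings() {
  grep -nHE 'name="(contextPath|resourceBase|war)"' "$@"
}
```

Call it with whatever XML files your Jetty installation uses for its contexts, for example: jetty_context_settings /path/to/jetty/contexts/*.xml (the path here is a placeholder).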

Daniel
+1  A: 

As a convention, you could maintain a document detailing all the filesystem directories and corresponding URLs that are created. This will answer questions about file->URL and URL->file mappings, and is also useful for planning, resource management and security reviews.

What follows is food for thought rather than any serious proposal. My "proper" answer is to use good documentation. However...

An automated approach might be possible. Thinking freely (not necessarily practically!) you could find or create an Apache module/Jetty extension that adds a small virtual file to each web directory as it is served from the filesystem. That virtual file would contain the location of the directory on the server the files are served from, and maybe the internal server name, IP or other details to help bridge the gap between what you see on the web side and where it is in your intranet.

Mapping files to URLs is tricky. You might be able to automate it by scanning the HTTP access logs, with the server configured to use a custom log format that records the file served for each request. By scanning the URL accessed and the corresponding file served, you can map files back to URLs. (You also get the URL->file mapping, in case you don't want to browse the URL manually as I outlined in the paragraph above.)
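
For Apache, a sketch of the log idea: mod_log_config has the format strings %U (URL path requested) and %f (filename), so a dedicated log can record both per request. The log nickname, log path and function name below are made up, and paths containing spaces would confuse this simple parser.

```shell
# Assumed Apache config (nickname and log path are made up):
#   LogFormat "%U %f" urlmap
#   CustomLog /var/log/apache2/urlmap.log urlmap
#
# Hypothetical helper: read "URL-path filename" lines on stdin and print
# each distinct pair once as "file <- url", sorted by file.
file_to_url_map() {
  awk 'NF >= 2 && !seen[$0]++ { print $2 " <- " $1 }' | sort
}
```

Example call: file_to_url_map < /var/log/apache2/urlmap.log
Only URLs that have actually been requested show up, so the map is as complete as the traffic the log has seen.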

mdma