1. Depends on the webserver (and sometimes configuration of such). A description of various models:
Apache with mpm_prefork (default on unix): Process per request. To minimize startup time, Apache keeps a pool of idle processes waiting to handle new requests (which you configure the size of). When a new request comes in, the master process delegates it to an available worker, otherwise spawns up a new one. If 100 requests came in, unless you had 100 idle workers, some forking would need to be done to handle the load. If the number of idle processes exceeds the MaxSpare value, some will be reaped after finishing requests until there are only so many idle processes.
Apache with mpm_event, mpm_worker, mpm_winnt: Thread per request. Similarly, apache keeps a pool of idle threads in most situations, also configurable. (A small detail, but functionally the same: mpm_worker runs several processes, each of which is multi-threaded).
Nginx/Lighttpd: These are lightweight event-based servers which use select()/epoll()/poll() to multiplex a number of sockets without needing multiple threads or processes. Through very careful coding and use of non-blocking APIs, they can scale to thousands of simultaneous requests on commodity hardware, provided available bandwidth and correctly configured file-descriptor limits. The caveat is that implementing traditional embedded scripting languages is almost impossible within the server context, this would negate most of the benefits. Both support FastCGI however for external scripting languages.
2. Depends on the language, or in some languages, on which deployment model you use. Some server configurations only allow certain deployment models.
Apache mod_php, mod_perl, mod_python: These modules run a separate interpreter for each apache worker. Most of these cannot work with mpm_worker very well (due to various issues with threadsafety in client code), thus they are mostly limited to forking models. That means that for each apache process, you have a php/perl/python interpreter running inside. This severely increases memory footprint: if a given apache worker would normally take about 4MB of memory on your system, one with PHP may take 15mb and one with Python may take 20-40MB for an average application. Some of this will be shared memory between processes, but in general, these models are very difficult to scale very large.
Apache (supported configurations), Lighttpd, CGI: This is mostly a dying-off method of hosting. The issue with CGI is that not only do you fork a new process for handling requests, you do so for -every- request, not just when you need to increase load. With the dynamic languages of today having a rather large startup time, this creates not only a lot of work for your webserver, but significantly increases page load time. A small perl script might be fine to run as CGI, but a large python, ruby, or java application is rather unwieldy. In the case of Java, you might be waiting a second or more just for app startup, only to have to do it all again on the next request.
All web servers, FastCGI/SCGI/AJP: This is the 'external' hosting model of running dynamic languages. There are a whole list of interesting variations, but the gist is that your application listens on some sort of socket, and the web server handles an HTTP request, then sends it via another protocol to the socket, only for dynamic pages (static pages are usually handled directly by the webserver).
This confers many advantages, because you will need less dynamic workers than you need the ability to handle connections. If for every 100 requests, half are for static files such as images, CSS, etc, and furthermore if most dynamic requests are short, you might get by with 20 dynamic workers handling 100 simultaneous clients. That is, since the normal use of a given webserver keep-alive connection is 80% idle, your dynamic interpreters can be handling requests from other clients. This is much better than the mod_php/python/perl approach, where when your user is loading a CSS file or not loading anything at all, your interpreter sits there using memory and not doing any work.
Apache mod_wsgi: This specifically applies to hosting python, but it takes some of the advantages of webserver-hosted apps (easy configuration) and external hosting (process multiplexing). When you run it in daemon mode, mod_wsgi only delegates requests to your daemon workers when needed, and thus 4 daemons might be able to handle 100 simultaneous users (depends on your site and its workload)
Phusion Passenger: Passenger is an apache hosting system that is mostly for hosting ruby apps, and like mod_wsgi provides advantages of both external and webserver-managed hosting.
3. Again, I will split the question based on hosting models for where this is applicable.
mod_php, mod_python, mod_perl: Only the C libraries of your application will generally be shared at all between apache workers. This is because apache forks first, then loads up your dynamic code (which due to subtleties, is mostly not able to use shared pages). Interpreters do not communicate with each other within this model. No global variables are generally shared. In the case of mod_python, you can have globals stay between requests within a process, but not across processes. This can lead to some very weird behaviours (browsers rarely keep the same connection forever, and most open several to a given website) so be very careful with how you use globals. Use something like memcached or a database or files for things like session storage and other cache bits that need to be shared.
FastCGI/SCGI/AJP/Proxied HTTP: Because your application is essentially a server in and of itself, this depends on the language the server is written in (usually the same language as your code, but not always) and various factors. For example, most Java deployment use a thread-per-request. Python and its "flup" FastCGI library can run in either prefork or threaded mode, but since Python and its GIL are limiting, you will likely get the best performance from prefork.
mod_wsgi/passenger: mod_wsgi in server mode can be configured how it handles things, but I would recommend you give it a fixed number of processes. You want to keep your python code in memory, spun up and ready to go. This is the best approach to keeping latency predictable and low.
In almost all models mentioned above, the lifetime of a process/thread is longer than a single request. Most setups follow some variation on the apache model: Keep some spare workers around, spawn up more when needed, reap when there are too many, based on a few configurable limits. Most of these setups -do not- destroy a process after a request, though some may clear out the application code (such as in the case of PHP fastcgi).
4. If you say "the web server can only handle 100 requests" it depends on whether you mean the actual webserver itself or the dynamic portion of the webserver. There is also a difference between actual and functional limits.
In the case of Apache for example, you will configure a maximum number of workers (connections). If this number of connections was 100 and was reached, no more connections will be accepted by apache until someone disconnects. With keep-alive enabled, those 100 connections may stay open for a long time, much longer than a single request, and those other 900 people waiting on requests will probably time out.
If you do have limits high enough, you can accept all those users. Even with the most lightweight apache however, the cost is about 2-3mb per worker, so with apache alone you might be talking 3gb+ of memory just to handle the connections, not to mention other possibly limited OS resources like process ids, file descriptors, and buffers, and this is before considering your application code.
For lighttpd/Nginx, they can handle a large number of connections (thousands) in a tiny memory footprint, often just a few megs per thousand connections (depends on factors like buffers and how async IO apis are set up). If we go on the assumption most your connections are keep-alive and 80% (or more) idle, this is very good, as you are not wasting dynamic process time or a whole lot of memory.
In any external hosted model (mod_wsgi/fastcgi/ajp/proxied http), say you only have 10 workers and 1000 users make a request, your webserver will queue up the requests to your dynamic workers. This is ideal: if your requests return quickly you can keep handling a much larger user load without needing more workers. Usually the premium is memory or DB connections, and by queueing you can serve a lot more users with the same resources, rather than denying some users.
Be careful: say you have one page which builds a report or does a search and takes several seconds, and a whole lot of users tie up workers with this: someone wanting to load your front page may be queued for a few seconds while all those long-running requests complete. Alternatives are using a separate pool of workers to handle URLs to your reporting app section, or doing reporting separately (like in a background job) and then polling its completion later. Lots of options there, but require you to put some thought into your application.
5. Most people using apache who need to handle a lot of simultaneous users, for reasons of high memory footprint, turn keep-alive off. Or Apache with keep-alive turned on, with a short keep-alive time limit, say 10 seconds (so you can get your front page and images/CSS in a single page load). If you truly need to scale to 1000 connections or more and want keep-alive, you will want to look at Nginx/lighttpd and other lightweight event-based servers.
It might be noted that if you do want apache (for configuration ease of use, or need to host certain setups) you can put Nginx in front of apache, using HTTP proxying. This will allow Nginx to handle keep-alive connections (and, preferably, static files) and apache to handle only the grunt work. Nginx also happens to be better than apache at writing logfiles, interestingly. For a production deployment, we have been very happy with nginx in front of apache(with mod_wsgi in this instance). The apache does not do any access logging, nor does it handle static files, allowing us to disable a large number of the modules inside apache to keep it small footprint.
I've mostly answered this already, but no, if you have a long connection it doesn't have to have any bearing on how long the interpreter runs (as long as you are using external hosted application, which by now should be clear is vastly superior). So if you want to use comet, and a long keep-alive (which is usually a good thing, if you can handle it) consider the nginx.
Bonus FastCGI Question You mention that fastcgi can multiplex within a single connection. This is supported by the protocol indeed (I believe the concept is known as "channels"), so that in theory a single socket can handle lots of connections. However, it is not a required feature of fastcgi implementors, and in actuality I do not believe there is a single server which uses this. Most fastcgi responders don't use this feature either, because implementing this is very difficult. Most webservers will make only one request across a given fastcgi socket at a time, then make the next across that socket. So you often just have one fastcgi socket per process/thread.
Whether your fastcgi application uses processing or threading (and whether you implement it via a "master" process accepting connections and delegating or just lots of processes each doing their own thing) is up to you; and varies based on capabilities of your programming language and OS too. In most cases, whatever is the default the library uses should be fine, but be prepared to do some benchmarking and tuning of parameters.
As to shared state, I recommend you pretend that any traditional uses of in-process shared state do not exist: even if they may work now, you may have to split your dynamic workers across multiple machines later. For state like shopping carts, etc; the db may be the best option, session-login info can be kept in securecookies, and for temporary state something akin to memcached is pretty neat. The less you have reliant on features that share data (the "shared-nothing" approach) the bigger you can scale in the future.
Postscript: I have written and deployed a whole lot of dynamic applications in the whole scope of setups above: all of the webservers listed above, and everything in the range of PHP/Python/Ruby/Java. I have extensively tested (using both benchmarking and real-world observation) the methods, and the results are sometimes surprising: less is often more. Once you've moved away from hosting your code in the webserver process, You often can get away with a very small number of FastCGI/Mongrel/mod_wsgi/etc workers. It depends on how much time your application stays in the DB, but it's very often the case that more processes than 2*number of CPU's will not actually gain you anything.