Python, Perl and PHP all support TCP stream sockets. But exactly how do I use sockets in a script file that is run by a web server (e.g. Apache), assuming I only have FTP access and not root access to the machine?

  1. When a client connects to a specific port, how does the script file get invoked?

  2. Does the script stay "running" for the duration of the connection? (could be hours)

  3. So will multiple "instances" of the script be running simultaneously?

  4. Then how can method calls be made from one instance of the script to another?

+6  A: 

Scripting languages utilize sockets exactly the same way as compiled languages.

1) The script typically opens and uses the socket. It's not "run" or "invoked" by the socket, but directly controls it via libraries (typically calling into the native C API for the OS).

2) Yes.

3) Not necessarily. Most modern scripting languages can handle multiple sockets in one "script" application.

4) N/A, see 3)
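To illustrate point 3, here is a minimal sketch of one Python process serving several clients at once. It assumes a standalone script (not one run under Apache) and uses the standard `selectors` module; the function names `make_listener` and `serve` are made up for this example, and the echo behaviour is just a placeholder for real work.

```python
import selectors
import socket


def make_listener(host="127.0.0.1", port=0):
    """Create a non-blocking listening TCP socket (port 0 lets the OS pick one)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen()
    listener.setblocking(False)
    return listener


def serve(listener, should_stop=lambda: False):
    """Echo data back to every connected client from a single process."""
    sel = selectors.DefaultSelector()
    sel.register(listener, selectors.EVENT_READ)
    try:
        while not should_stop():
            # Wait (briefly) until at least one socket is readable.
            for key, _events in sel.select(timeout=0.1):
                sock = key.fileobj
                if sock is listener:
                    # New client: accept it and watch it alongside the others.
                    conn, _addr = listener.accept()
                    conn.setblocking(False)
                    sel.register(conn, selectors.EVENT_READ)
                else:
                    data = sock.recv(4096)
                    if data:
                        sock.sendall(data)  # acceptable for small messages in a sketch
                    else:
                        # Empty read: the client closed its end.
                        sel.unregister(sock)
                        sock.close()
    finally:
        # Close the listener and any clients that are still connected.
        for key in list(sel.get_map().values()):
            sel.unregister(key.fileobj)
            key.fileobj.close()
        sel.close()
```

Calling `serve(make_listener(port=5000))` would run until the process is stopped; the port number is arbitrary. Since every connection lives in the same process, "calls between instances" reduce to ordinary function calls on shared data.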


Edit in response to change in question and comments:

It is now obvious that you are trying to run this in the context of a hosted server. Typically, if you're using scripting within Apache or a similar server, things work a bit differently. Apache opens and maintains the socket, then executes your script, passing it the relevant data (POST/GET parameters, etc.) to process. Sockets usually don't come into play when you're dealing with scripting for CGI, etc.

However, this typically happens using the same concepts as mod_cgi. This pretty much means that the running script is nothing but an executable as far as the server is concerned, and the executable's output is what gets returned to the client. In this case (provided you have permissions and the correct libraries on the server), your Python script can actually launch a separate script that does its own socket work completely outside of Apache's context.

It's (usually) not a good idea to run a full socket implementation directly inside of the CGI script, however. CGI will expect the executable to run to completion before it returns results to the client. Apache will sit there and "hang" a bit waiting for this to complete. If you're launching a full server (especially if it's a long running process, which they tend to be), Apache will think the script is locked, and probably abort, potentially killing the process (configuration specific, but most hosting companies do this to prevent scripts from taking over CPU on a shared system).

However, if you execute a new script from within your script, and then return (shutting down the CGI executable), the other script can be left running, working as a server. This would be something like (Python example, using the subprocess library):

from subprocess import Popen

# Start MyScript as a separate process; this call does not wait for it to finish.
new_process = Popen("python MyScript", shell=True)

Note that all of the above really depends a bit on server configuration, though. Many hosting companies don't include some of the socket or shell libraries in their scripting implementations specifically to prevent this, so you often have to resort to writing the executable in C. In addition, this is against the terms of service of most hosting companies; you'd have to check yours.
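A slightly more defensive variant of the launch is sketched below. It is an assumption-laden illustration, not a recipe guaranteed to work on any given host: `launch_detached` is a hypothetical helper name, `start_new_session=True` only has an effect on POSIX systems, and the host may still kill or forbid the child. Redirecting the child's standard streams matters because a child that inherits the CGI script's stdout can keep the web server waiting for more output.

```python
import subprocess
import sys


def launch_detached(script_args):
    """Start a Python child process that is not tied to this script's lifetime."""
    return subprocess.Popen(
        [sys.executable, *script_args],  # argument list, so no shell is needed
        stdin=subprocess.DEVNULL,        # do not share the CGI script's streams
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,          # POSIX only: run the child in its own session
    )
```

For the example above this would be called as `launch_detached(["MyScript"])`, after which the CGI script prints its response and exits.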

Reed Copsey
How does the script initially "open"? Does it require a page request from a client?
Jenko
Are you asking about how a web-page backed by a PHP script or Python script gets executed? Typically, a web server (such as Apache) processes the HTTP request, figures out what script is responsible and then executes that script, passing the script some information about the request and its context.
Emil
No, just asking how Reed thinks the script should "open" up? Eg. Because a Python script is just a ".py" file on a server. Would we have to configure Apache/Unix in any way to automatically "open" this script at system startup, or would it fire up later somehow?
Jenko
You'd run your Python (or whatever) program just like any other program. If you're talking about what happens when there's a web server like Apache involved, the web server itself invokes your script.
Mike Daniels
Jeremy: you said "no", but then proceeded to answer "yes". The case you're talking about sounds like CGI. In CGI, the script doesn't deal with the socket directly: your web server (e.g. Apache) does, and then invokes the script as Emil described. This is actually not a "scripting language" specific thing. You could write a CGI program in C, if you wanted to. How you tell your web server which files to execute (rather than treat as documents), and how to execute them, depends on the server. For Apache, the mod_cgi docs are a good place to start reading about how this works.
Laurence Gonsalves
@Laurence - Very confusing. So you mean to say that I can't get a script file to remain "running" and use many sockets, if my script is initially invoked via Apache server?
Jenko
+1  A: 

The only way I can make sense of what you're asking is if you use inetd or a similar meta-server, which is configured to invoke your "service a single client" program for a specific listening port, forwarding your "single client servicer" program's stdin/stdout to the remote client.

If that's the case:

1) inetd runs it

2) yes

3) yes

4) named pipes are one possibility
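Under the inetd model described above, the "single client servicer" never touches a socket: it only reads stdin and writes stdout. A minimal sketch of such a program follows; `handle_client` is a hypothetical name and the upper-casing is a stand-in for real request handling.

```python
import sys


def handle_client(reader=None, writer=None):
    """Serve one client: read lines from the client, write replies back.

    When started by inetd, stdin and stdout are already connected to the
    remote client, so no socket code is needed here.
    """
    reader = sys.stdin if reader is None else reader
    writer = sys.stdout if writer is None else writer
    for line in reader:
        writer.write(line.upper())  # placeholder for real processing
        writer.flush()              # send the reply right away, not at exit
```

A script meant for inetd would end with a bare `handle_client()` call; the loop finishes when the client closes the connection, and the process then exits.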

wrang-wrang
+2  A: 

As a prior answer notes, scripting languages operate in this regard in exactly the same way as compiled programs. Where they (potentially) differ is in the API that they use. The operating system (Windows or Unix-based) offers an API (e.g., BSD sockets) that compiled programs will typically call directly. Interpreted languages like PHP or Python may offer a different API, such as Python's socket API, which may simplify some parts of the underlying API.

Given any of these APIs, there are many ways in which the actual handling of an incoming TCP connection can be structured. A great and detailed overview of such approaches is available on the c10k webpage: http://www.kegel.com/c10k.html -- in particular, the section on IO strategies. In short, the choice of answers to your question is up to the programmer and may affect how the resulting program performs under load.

To focus on your specific questions:

  1. Many server programs are started before the connection and are running to listen for incoming connections. A special case is inetd which is a superserver: it listens for connections and then hands off those connections to programs that it starts (specified in a config file).
  2. Typically, yes, the script remains running for the duration of the connection. However, depending on the larger system architecture, the script could conceivably pass the connection off to another program for handling and then exit.
  3. This is a choice, again as enumerated on the c10k page.
  4. This is another choice; operating systems offer a variety of Interprocess Communication (IPC) mechanisms to programs.
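As one concrete instance of the IPC choice in point 4, the sketch below passes a message through a named pipe (FIFO), the mechanism suggested in another answer. It assumes a Unix-like system, since `os.mkfifo` is not available on Windows. The function names and the FIFO file name are invented for the example, and a thread stands in for the second script instance so the demo fits in one file.

```python
import os
import tempfile
import threading


def send_message(fifo_path, message):
    """Writer side: opening the FIFO blocks until a reader has opened it too."""
    with open(fifo_path, "w") as fifo:
        fifo.write(message)


def receive_message(fifo_path):
    """Reader side: read() returns once the writer closes its end."""
    with open(fifo_path) as fifo:
        return fifo.read()


def demo():
    """Send one message through a temporary FIFO and return what was received."""
    with tempfile.TemporaryDirectory() as tmp:
        fifo_path = os.path.join(tmp, "instances.fifo")
        os.mkfifo(fifo_path)
        # In real use the writer would be a separate script instance (process).
        writer = threading.Thread(
            target=send_message, args=(fifo_path, "ping from instance A")
        )
        writer.start()
        received = receive_message(fifo_path)
        writer.join()
        return received
```

Two real instances would simply agree on a FIFO path in advance, with one calling `send_message` and the other `receive_message`.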
Emil
+1  A: 

When a client connects to a specific port, how does the script file get invoked?

The script must already be running in order to receive connections from any client. You will need the script to stay alive forever (an infinite loop) and to set up Apache not to kill it on timeout. Basically, PHP is not a good choice for writing server applications. Why do you need this?

FractalizeR
Infinitely looping? You mean I actually have to use a "while(1)"?? But can't I program it to be asynchronous? (async = the program lives on *without* looping, and reacts to events, such as reception of socket data)
Jenko
asynchronous does not mean that a program is free of `while(1)` or any other particular construct, only that when one activity is stalled, the program is able to work on other things.
TokenMacGuy
Not necessarily infinite looping. You can use persistent sockets and select or stream_select in a script that is called periodically to see if there is something to respond to. But this is complicated, as you may find that not all the data for a request has been received, and therefore you need to block until all the data is there. So, PHP is not a good choice for a server application after all.
FractalizeR