I came across the omegle.com website. I know that it picks two random strangers and pairs them up so that they can chat. In short, I know what it does. But I want to create a similar project as my final-year project, so I want to understand the technical details behind it.

Can anyone suggest the best technology for developing this kind of website?
(Please don't just suggest that I develop it in the language I am already familiar with.)

The prime focus in selecting the technology is that the application should be able to scale up. By scale up I mean it should be able to handle an increasing number of users without crashing or slowing down.

Is Python suitable for developing it? I read somewhere that the website was developed using Python and a framework called Twisted (http://twistedmatrix.com/trac/).

+2  A: 

There is no technology in existence that will magically make a web application scalable. Scalability only happens if you get your system architecture right.

Since this is a term project, you should be focused on satisfying the course requirements for the project ... and on improving your design and coding skills. Avoid setting yourself unrealistic goals ... like implementing a highly scalable web service in a totally new (to you) language and framework in a tight time-frame.

Stephen C
+3  A: 

So, the primary technical issues relate to how you maintain the communications for some number of paired endpoints. You mention a website, and it seems evident that you'd like this service to be accessed via normal, common web browsers. In other words, it sounds like you're constrained to an AJAX solution.

(Theoretically you could fall back to an even simpler HTML/form driven CGI model ... with meta-refresh tags driving the chat updates; however that would be pretty clunky by comparison to the preponderance of "Web 2.0" offerings that are all around us and thus a poor choice for an academic project. Having your implementation gracefully fall back to this model as necessary would be a nice touch; but the focus has to be on something with a Web 2.0 "AJAX" feel).

The reason I'm elaborating on this point is that a purely peer-to-peer model would be far easier to scale. In that model you'd distribute clients (something like Yahoo Messenger) and your web service would simply be the broker through which these would be connected. You could then arrange the pairings (and perform whatever logging or other scrivening you like on the back end).

Alas that might not be possible with a purely browser based application. This is due to some of the safeguards which are supposed to be implemented by your browser, constraints on how the embedded Javascript engine is allowed to establish network connections. (You might be able to pull it off despite these features, more on that later).

So let's start with the basics ... you have a basic HTML page and form to introduce your users to the service; then you find a pairing for them and arrange to relay messages from one to the other. Thus every time one of the paired users types a line (or even just a character) in their window/frame, it's sent to your server (as an AJAX/XML request) and then it's queued/buffered for relay to the other party during their next update. (My knowledge of AJAX is a little weak on this point: you may have to arrange for your clients to poll the server whenever they've been idle for some amount of time, so that the user who is "listening" without typing still gets updates; or it may be possible for the local Javascript to maintain an open connection and do a blocking read on that socket for events. That's left as a matter for your research.)
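A minimal sketch of that relay model, in Python since the question mentions it. This is an illustration under stated assumptions, not how omegle.com actually works: the class name `ChatRelay` and its methods are hypothetical, state is kept in memory on a single server, and a new user is simply paired with whoever is already waiting rather than with a truly random stranger. In a real service `send` and `poll` would be called from the AJAX request handlers.

```python
from collections import deque


class ChatRelay:
    """In-memory relay: pairs strangers and buffers messages until polled.

    Hypothetical sketch; a real server would sit behind HTTP handlers.
    """

    def __init__(self):
        self.waiting = None   # user id waiting for a partner, if any
        self.partner = {}     # user id -> paired user id
        self.inbox = {}       # user id -> deque of undelivered messages

    def join(self, user):
        """Register a user; return the partner id, or None if they must wait."""
        self.inbox[user] = deque()
        if self.waiting is None:
            self.waiting = user
            return None
        other, self.waiting = self.waiting, None
        self.partner[user] = other
        self.partner[other] = user
        return other

    def send(self, user, text):
        """Queue text for the sender's partner; False if not paired yet."""
        other = self.partner.get(user)
        if other is None:
            return False
        self.inbox[other].append(text)
        return True

    def poll(self, user):
        """Return and clear everything queued for this user since the last poll."""
        messages = list(self.inbox[user])
        self.inbox[user].clear()
        return messages


relay = ChatRelay()
relay.join("alice")          # alice waits
relay.join("bob")            # bob is paired with alice
relay.send("alice", "hi")    # buffered on the server for bob
print(relay.poll("bob"))     # bob's next update picks it up
```

Note that every message costs the server at least two requests (one send, one poll), which is where the load concerns discussed next come from.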

I hope you start to see where the scalability and performance concerns arise. A modest web server can handle hundreds of simple HTTP requests per second ... and dozens of dynamic ones. If you were able to establish peer-to-peer connections then your scalability would be bounded by the rate of new connections (independent of the number of ongoing conversations). However, if your server has to relay every line (or worse, every character) that's being exchanged then this will greatly increase the load on your servers.
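To make the difference concrete, here is a back-of-the-envelope estimate in Python. The function name and every number fed into it are assumptions chosen purely for illustration (they are not measurements of any real service): each user sends at some average rate and also polls for updates at a fixed interval.

```python
def relay_requests_per_second(users, sends_per_user_per_s, poll_interval_s):
    """Rough request rate for a server that relays every message.

    All inputs are illustrative assumptions, not measured values.
    """
    sends = users * sends_per_user_per_s   # messages pushed to the server
    polls = users / poll_interval_s        # "anything new?" requests
    return sends + polls


# 1,000 chatting users, polling every 2 seconds (assumed figures):
line_mode = relay_requests_per_second(1000, 0.2, 2.0)   # one line every 5 s
char_mode = relay_requests_per_second(1000, 4.0, 2.0)   # four keystrokes per s
print(line_mode, char_mode)
```

Under these assumed figures the character-by-character mode generates several times the request rate of the line-oriented mode for the same number of users, while a pure broker would only see the rate of new pairings.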

I can offer some rough estimates of server capacity without even looking at code or a design. Those would be approximate upper bound capacities. For example a well tuned Linux server might be able to handle 10,000 concurrent TCP/IP connections. That's assuming minimal CPU and memory overhead ... just simply grabbing bits from one socket (incoming buffer), performing a mapping and stuffing them into another socket (outbound).

Something like Nagle's Algorithm might be used, but the OS's TCP/IP stack might not be a good place for that to happen if you need to wrap your keystroke events in tags in order to conform to the well-formed XML requirements of existing AJAX frameworks and libraries. So you might have to introduce your own delay buffering within your application's own input handler. (Not an issue if you go for a line-oriented chat model rather than character-by-character.)
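One way such application-level delay buffering could look, sketched in Python. This is not Nagle's Algorithm itself, just a simple time-and-size based coalescing buffer in the same spirit; the class name `KeystrokeBuffer` and the thresholds are hypothetical, and the clock is injectable only so the behaviour can be tested.

```python
import time


class KeystrokeBuffer:
    """Coalesce keystrokes and release them in batches (hypothetical sketch)."""

    def __init__(self, max_delay=0.2, max_chars=32, clock=time.monotonic):
        self.max_delay = max_delay   # seconds the oldest keystroke may wait
        self.max_chars = max_chars   # batch size that forces a send
        self.clock = clock
        self.chars = []
        self.first_at = None         # arrival time of the oldest buffered keystroke

    def add(self, ch):
        """Buffer one keystroke; return a batch to send, or None to keep waiting."""
        now = self.clock()
        if self.first_at is None:
            self.first_at = now
        self.chars.append(ch)
        if len(self.chars) >= self.max_chars or now - self.first_at >= self.max_delay:
            return self.flush()
        return None

    def flush(self):
        """Return whatever is buffered (possibly an empty string) and reset."""
        batch = "".join(self.chars)
        self.chars = []
        self.first_at = None
        return batch
```

A complete implementation would also call `flush()` from a timer, so that the last few characters are sent when the user stops typing; each flushed batch can then be wrapped in whatever request format the AJAX layer expects.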

To scale beyond that you'd need to fan out to multiple web servers. The most straightforward approach would be to use HTML/HTTP redirection to a farm of servers ... with the constraint that the random chat session would be established with someone else who was directed to that same server. (Cue up a host of considerations regarding server load balancing, consistent hashing techniques and so on.)
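As an illustration of the consistent hashing idea, here is a small hash ring in Python. It assumes a front-end broker that pairs two users, gives the pair a room id, and redirects both of them to whichever server the ring picks for that id, so both ends land on the same machine. The class name `HashRing`, the host names, and the room id format are all hypothetical.

```python
import bisect
import hashlib


def _hash(key):
    """Map a string to a large integer position on the ring."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Consistent hash ring with virtual nodes (needs at least one server)."""

    def __init__(self, servers, replicas=100):
        # Each server gets `replicas` points on the ring to even out the load.
        self.ring = sorted(
            (_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(replicas)
        )
        self.points = [point for point, _ in self.ring]

    def server_for(self, key):
        """Return the server owning the first ring point after the key's hash."""
        idx = bisect.bisect(self.points, _hash(key)) % len(self.ring)
        return self.ring[idx][1]


ring = HashRing(["chat1.example.com", "chat2.example.com", "chat3.example.com"])
# Both users of a pair are redirected using the same room id.
print(ring.server_for("room-42"))
```

The point of the ring over a plain `hash(key) % n` is that adding or removing a server only remaps the rooms that belonged to that server, rather than reshuffling nearly all of them.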

One possibility you could explore would be to see if you could package up a Firefox "add-on", something like Chatzilla. If your code could work with such an add-on (and perhaps optionally fall back to the AJAX-driven approach to accommodate those pesky users who won't install your package) ... then you should be able to make your clients do all the work, leaving your servers to just handle new connections and any of your scrivener's requirements. Obviously, to cater to your IE users you'd have to offer an ActiveX control, and I don't even know what you'd do for Safari, Opera or other users (other than fall back to the basics).

Jim Dennis
A: 

It's easier than you think. I've built similar stuff (but with more advanced features, though). Protip: Don't make it more advanced than it is.

Joakim