tags:
views: 301
answers: 7

It seems that all the major investment banks use C++ in Unix (Linux, Solaris) for their low latency/high frequency server applications. Why is Windows generally not used as a platform for this? Are there technical reasons why Windows can't compete?

+7  A: 

Technically, no. However, there is a very simple business reason: the rest of the financial world runs on Unix. Banks run on AIX, the stock market itself runs on Unix, and therefore it is simply easier to find programmers in the financial world who are used to a Unix environment rather than a Windows one.

Ryan Gooler
+1 for not devolving into Unix FUD.
Billy ONeal
It's definitely NOT about finding guys with Windows knowledge. I assure you, OS-related knowledge is 1% of the knowledge required to do that kind of development, and everyone working in this area can switch to ANY OS if needed.
BarsMonster
@BarsMonster: I could see, though, how things like networking stacks, which are completely platform-dependent, would require some body of knowledge before being able to switch platforms. Plus things like `fork`, which are common in the POSIX world but are not possible in a Windows environment. I think it's a reasonable answer.
Billy ONeal
That may be true in a standard environment, but in a very low-latency environment, deep understanding of the OS is a must, in order to be able to keep the overhead of your code down, to use the optimal system calls for a specific task, and to be familiar with the specific quirks and gotchas.
Ryan Gooler
@Billy ONeal: I am not saying that it's easy to port. I am saying that these guys (who are paid well above average) can gain the required knowledge of OS-specific things no matter what the OS is (not only Google - paid trainings, expensive consultants who wrote the target networking stack on their own :-) ). Surely all OSes have some tiny dirty secrets, but that's NOT a reason to choose a specific OS.
BarsMonster
A: 

Linux/UNIX are much more usable for concurrent remote users, making it easier to script around the systems, use standard tools like grep/sed/awk/perl/ruby/less on logs... ssh/scp... all that stuff's just there.

There are also technical issues, for example: to measure elapsed time on Windows you can choose between a set of functions based on the Windows clock tick, and the hardware-based QueryPerformanceCounter(). The former increments every 10 to 16 milliseconds (note: some documentation implies more precision - e.g. the values from GetSystemTimeAsFileTime() measure to 100ns, but they report the same 100ns edge of the clock tick until it ticks again). The latter - QueryPerformanceCounter() - has show-stopping issues where different cores/CPUs can report clocks-since-startup that differ by tens of milliseconds. MSDN documents this as a possible BIOS bug, but it's common. So, who wants to develop low-latency trading systems on a platform that can't be instrumented properly?

Many Linux/UNIX variants have lots of easily tweakable parameters to trade off latency for a single event against average latency under load: time slice sizes, scheduling policies, etc.

On the FUD/reputation side - somewhat intangible but an important part of the reasons for OS selection - I think most programmers in the industry would just trust Linux/UNIX more to provide reliable scheduling and behaviour. Further, Linux/UNIX has a reputation for crashing less, though Windows is pretty reliable these days, and Linux has a much more volatile code base than Solaris or FreeBSD.

Tony
Windows *client* operating systems only allow one person to use RDP at a time. However Windows Terminal Server has been around forever (it was, in fact, the original use of RDP) and it allows as many connections as you have Client Access Licenses. Windows Server OSs come with the capability to have more than one remote user by default. If you could source the comment about scheduling then I would +1 here -- that part of the answer seems to be FUD at this point to me (the rest of the answer is good). YMMV.
Billy ONeal
There is no UNIX/Linux scheduling. It's one of the areas in which implementations differ. And Linux in fact has had more than one scheduler choice (google Completely Fair Scheduler Linux for background), so you can't even say "Linux scheduling is reliable".
MSalters
@Billy: thanks for the correction re RDP - answer updated appropriately. Have made it clearer what's FUD/opinion, which I still believe to be relevant to the question. @MSalters: That's like saying there is no sport because there's soccer and tennis. UNIX/Linux scheduling can still be addressed collectively. And you can reasonably generalise, just as one can say playing sport is healthy...
Tony
A: 

What's the reported uptime reliability of the various OS's? Would you want your trading system to go down in the middle of a multimillion dollar arbitrage?

hotpaw2
Stop flamewars please. Modern server OSes from Microsoft with _professional_ administration have no reliability issues.
BarsMonster
This is not an answer. And even if you try to pass it off as one, it's simple Unix FUD at that point.
Billy ONeal
I never mentioned a specific OS. Merely that a system designer would require evidence from various other similar installations on the actual reportable uptime and whether it varied between the OS(es) in use.
hotpaw2
+4  A: 

The reason is simple: 10-20 years ago, when such systems emerged, "hardcore" multi-CPU servers ran ONLY on some sort of UNIX. Windows NT was in kindergarten in those days. So the reason is "historical".

Modern systems might be developed on Windows, it's just a matter of taste these days.

PS: I am currently working on one such system :-)

BarsMonster
+1 for *another* answer not devolving into Unix FUD.
Billy ONeal
+2  A: 

There are a variety of reasons, but the reason is not only historical. In fact, it seems as if more and more server-side financial applications run on *nix these days than ever before (including big names like the London Stock Exchange, who switched from a .NET platform). For client-side or desktop apps, it would be silly to target anything other than Windows, as that is the established platform. However, for server-side apps, most places that I have worked at deploy to *nix.

Rory
Windows certainly was not the established desktop platform in 1990, when such trading systems were first developed. And if you needed serious performance on your desktop, 16-bit Windows was not an option.
MSalters
+3  A: 

The performance requirements on the extremely low-latency systems used for algorithmic trading are extreme. In this environment, microseconds count.

I'm not sure about Solaris, but in the case of Linux, these guys are writing and using low-latency patches and customisations for the whole kernel, from the network card drivers on up. It's not that there's a technical reason why that couldn't be done on Windows, but there is a practical/legal one - access to the source code, and the ability to recompile it with changes.

caf
This seems like a good answer, but do you know for a fact that they write low-latency patches and recompile the kernel?
Jon
@Jon: I have only anecdotal evidence, from various discussions on LKML and similar places over the years (eg. Christoph Lameter is a kernel developer who was working on low-latency for such applications for a while).
caf
+5  A: 

(I've worked in investment banking for 8 years.) In fact, quite a lot of what banks call low latency is done in Java. And not even Real Time Java - just normal Java with the GC turned off. The main trick here is to make sure you've exercised all of your code enough for the JIT to have run before you switch a particular VM into prod (so you have some startup looping that runs for a couple of minutes - and hot failover).

The reasons for using Linux are:

Familiarity

Remote administration is still better, and also low impact - it will have a minimal effect on the other processes on the machine. Remember, these systems are often co-located at the exchange, so the links to the machines (from you/your support team) will probably be worse than those to your normal datacentres.

Tunability - the ability to set swappiness to 0, get the JVM to preallocate large pages, and other low level tricks is quite useful.

I'm sure you could get Windows to work acceptably, but there is no huge advantage to doing so - as others have said, any employees you poached would have to rediscover all their latency busting tricks rather than just run down a checklist.

rjw