views:

1910

answers:

21

A recruiter has sent me a fairly long quiz from an employer, and I think it's a bit excessive for what is supposedly a Senior Java developer role. The questions (to me) seem unfocused, and go into areas such as architecture, operations, and capacity planning. One of the questions is this:

"Internet-scale applications must be designed to process high volumes of transactions. In 400 words or less, describe a design for a system that must process on average 30,000 HTTP requests per second. For each request, the system must perform a lookup into a dictionary of 50 million words, using a key word passed in via the URL query string. Each response will consist of a string containing the definition of the word (100 bytes or less). Describe the major components of the system, and note which components should be custom-built and which components could leverage third-party applications. Include hardware estimates for each component. Please note that we are interested in maximum performance at minimum hardware / software-licensing costs. Please document your rationale in coming up with your estimates. Optional: For extra credit, describe how the design would change if the definitions are 10 kilobytes each. "

Now given that even a site like Twitter is really handling just 11K requests per second (maybe more, I haven't checked lately), it makes me think, "What pr0n are these dudes planning to serve up?" =)

Of course you can come up with a general answer - "memcache the hell out of it", "cluster and load balance" - but hardware estimates and licensing costs? Do you think this is a reasonable question?

A: 

I think this is a reasonable question. I had harder questions in my college courses at CMU. The point is to test your knowledge, and not just to get free ideas.

Justin
Ok, just for starters. - what web server software would YOU use, and why?
Thorbjørn Ravn Andersen
This is a reasonable question for someone to ask you if they're paying you for the answer. Designing a high-availability website is not something many people do for free.
MattC
+4  A: 

This seems like overkill to me, especially with the 400 word limit. You may try using this as an opportunity to focus on what you believe is relevant to your desired role. Your answers will then define what you believe your role should be for the recruiter.

Greg D
+10  A: 

Given the cost angle being put in, that sounds much more like a Senior Systems Engineer type question than a Java developer one.

MBCook
The questions they ask are better suited to a Software Architect post.
janetsmith
+5  A: 

No, this is not a reasonable question.
There are too many unknowns, and if they were interested in a valid answer, why limit it to 400 words?

Bravax
+17  A: 

Solving the problem for a 50,000,000 word dictionary is a lot different than the Twitter problem. You can have the entire dictionary sitting in RAM, serve it with thttpd, and handle the load with two computers.

ראובן
if this is accurate, this is a fantastic answer verbatim.
Dustin Getz
It is a tad large, but of course you can build a system where it'll fit: ((50,000,000 * 200) / 1,024) / 1,024 = 9,536.74316 MB, or roughly 9.3 GB, assuming 200 bytes per entry.
dlamblin
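The arithmetic in the comment above can be sketched in Java as follows. This is only a back-of-the-envelope calculation: the class and method names are hypothetical, and the 200 bytes per entry is the comment's own assumption (it ignores JVM object overhead, which would push the real figure higher).

```java
// Back-of-the-envelope RAM estimate for keeping the whole dictionary in memory.
final class DictionaryMemoryEstimate {

    // bytesPerEntry is an assumption: 200 bytes is meant to cover a definition
    // of up to 100 bytes plus the key word and some slack.
    static double estimateMiB(long entries, long bytesPerEntry) {
        return (entries * bytesPerEntry) / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        double mib = estimateMiB(50_000_000L, 200L);
        System.out.println(mib + " MiB, or about " + (mib / 1024.0) + " GiB");
    }
}
```

Under the same assumptions, plugging in 10,240 bytes per entry for the extra-credit case gives roughly 477 GiB, which no longer fits comfortably in one machine's RAM and is where sharding or disk-backed storage would come in.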
Don't forget, they said they were "words", so you can even do some mild compression on them. And 10GB isn't outrageous these days. I have 16GB in my Mac Pro here....In fact, I suspect this was the answer they were looking for.
ראובן
Very clever. I think the important realization here is spotting that they are asking for a very narrow subset of features compared to a typical web app.
quillbreaker
+2  A: 

Bad question/essay. Consider this: if this company/recruiter can't boil down a massive concept like this one into a few bite-sized questions that can be answered reasonably (as related to the position), what will they have in store for you if they hire you? Run, run fast and far.

shambleh
No, it's not. Open ended questions reveal problem solving skills and also highlight blind spots.
quillbreaker
@quillbreaker I agree about open-ended questions, but if they don't have the sense to provide a proper way to answer said questions ("400 word limit"), then they aren't holding up their end of the deal.
shambleh
The 400 word limit may seem a weird requirement, but good communication is an invaluable skill for some recruiters. Being optimistic, I want to think they do it for that reason.
Chubas
+3  A: 

Sounds like a very clever way of getting free consulting!

On the other hand, the "max performance at minimum hardware / software-licensing" is a bogus requirement. There is no way without a ton of additional context that you could answer that question realistically.

If I were actually going to take the question seriously, I would add my own set of reasonable performance requirements and use those as a guide. The interviewer should not really care what values you pick as long as your solution is consistent.

Darrel Miller
+2  A: 

The problem with this type of question is that it could get you the job without testing whether you could actually write the code to make this happen! Who cares if a Senior developer can answer a question like this? The question should give you this information and ask you to code a portion of it!

Molex
+8  A: 

I don't know your current situation but faced with this I would probably print out this test, crumple it up, and throw it in the garbage.

If it was somewhere I was interested in working at, I'd be a bit bewildered but I'd play ball. I've asked questions similar to this in in-person interviews - but instead of asking for an essay response I have the candidate sketch out a logical diagram of the system, hoping that he explains each component along the way. The way he describes it can fit into the passion that Joel talks about.

As far as getting into details of licensing and such, I'd brush that off with broad answers - "that's expensive", "that's free." If they kept pushing, I would ask how this is relevant to whether or not my skill set is what they are looking for.

bpapa
Agree with all but the printing part. It's not worth the waste of paper or toner. ;-)
Huntrods
+1  A: 

Given that this question was passed to you via a recruiter, it does sound a little excessive for the initial candidate screening that I'd assume it's trying to achieve.

The request volumes are unrealistic (unless this was actually Twitter or Google) and the budget aspects unnecessary. While it might be an interesting face-to-face interview question, I think (at what I suspect is an early stage in the recruitment process) it's overkill, to say the least.

Could well be indicative of the company as a whole and would certainly be a warning sign to me.

Nick Holt
@Nick: or Amazon :-)
Stephen C
@Stephen - yeah, there are a few out there with crazy volumes these days, but one thing's for sure: they didn't build their systems explicitly to handle such huge volumes. Instead they adapted their systems as the volumes grew - see http://highscalability.com/youtube-architecture for an example of this.
Nick Holt
@Nick: Ummm .... I've been there. In the case of Amazon, they really do build some systems to handle crazy big volumes from day 1. In fact the problem reminds me a lot of a particular Amazon system ...
Stephen C
+1  A: 

"senior software engineer" at my company is a bright 27yo to a mediocre 33yo and making roughly 90-100k. the people solving problems like that are a couple levels higher.

Dustin Getz
+5  A: 

Just for fun, here is my solution:

Assumptions: It is assumed that the dictionary doesn't change often, and that there is thus no need to optimize the process of inserting new words into the dictionary.

Solution: Use a number of dual-processor/quad-core servers running Tomcat to serve the HTTP requests. The dictionary is small, so you can just keep the entire dictionary in RAM in a ConcurrentHashMap. You also need a backend database server, which the user can use to update the dictionary. This server should then push updates out to the frontend Tomcat servers.
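A minimal sketch of the in-memory lookup described above, assuming the key word arrives in a query parameter named `word` (the question does not name the parameter, so that name, the class name, and the method names are all hypothetical). It shows only the map and the query-string parsing, not the Tomcat servlet wiring or the update push from the backend.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Holds the whole dictionary in RAM; safe for concurrent reads and updates.
final class InMemoryDictionary {
    private final Map<String, String> definitions = new ConcurrentHashMap<>();

    // Called when the backend pushes a new or changed entry.
    void put(String word, String definition) {
        definitions.put(word, definition);
    }

    // Returns the definition, or null if the word is unknown.
    // ConcurrentHashMap rejects null keys, so guard against them here.
    String lookup(String word) {
        return word == null ? null : definitions.get(word);
    }

    // Extracts the value of the assumed "word" parameter from a raw query
    // string such as "lang=en&word=cat"; returns null if it is absent.
    static String wordFromQuery(String rawQuery) {
        if (rawQuery == null) {
            return null;
        }
        for (String pair : rawQuery.split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0 && pair.substring(0, eq).equals("word")) {
                return URLDecoder.decode(pair.substring(eq + 1), StandardCharsets.UTF_8);
            }
        }
        return null;
    }
}
```

In a servlet, the request handler would pass the raw query string to `wordFromQuery` and write the result of `lookup` to the response body.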

I estimate that each core can serve 100 requests/second*, so with 8 cores you get 800 requests per second per server. You thus need something like 40 servers.

*The real number may be much higher, but the right answer depends on whether the requests come from 100 different users or from one user sending 100 requests with keep-alive.
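The server count follows from simple ceiling division, sketched here with hypothetical names. The 100 requests per second per core is the answer's own guess, not a measured figure.

```java
// Estimates how many frontend servers are needed for a target request rate.
final class FrontendSizing {

    static int serversNeeded(int targetRps, int coresPerServer, int rpsPerCore) {
        int perServer = coresPerServer * rpsPerCore;
        // Ceiling division: a partially loaded server still has to be bought.
        return (targetRps + perServer - 1) / perServer;
    }

    public static void main(String[] args) {
        // 30,000 requests/second, 8 cores per server, 100 requests/second per core.
        System.out.println(FrontendSizing.serversNeeded(30_000, 8, 100));
    }
}
```

With these inputs the strict minimum is 38 servers; rounding up to about 40 leaves a little headroom, though the question only gives an average load and says nothing about peaks.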

Needed hardware: a load balancer and 40 frontend servers. Price: Don't have a clue, ask your $HARDWARE_VENDOR for a quote. License cost: 0.

Warning signs: The use of the word "transactions" does seem out of place. I don't know anyone who would call an HTTP GET/response a transaction.

Martin Tilsted
If the dictionary is immutable, then a simple immutable HashMap would solve it. You don't need a ConcurrentHashMap.
notnoop
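If the dictionary really never changes after startup, one way to get the immutable map suggested in the comment above is `Map.copyOf` (Java 10 and later). Note that this returns the JDK's own unmodifiable map rather than a `HashMap`, so it is a stand-in for the commenter's idea, and the class name is hypothetical.

```java
import java.util.Map;

// Read-only dictionary: built once at startup, then shared by all request threads.
final class ImmutableDictionary {
    // A final field holding an unmodifiable map can be read without locking.
    private final Map<String, String> definitions;

    ImmutableDictionary(Map<String, String> source) {
        // Map.copyOf takes a snapshot; later changes to source are not seen.
        this.definitions = Map.copyOf(source);
    }

    // Returns the definition, or null if the word is unknown.
    // The unmodifiable map rejects null keys, so guard against them here.
    String lookup(String word) {
        return word == null ? null : definitions.get(word);
    }
}
```

The trade-off is that any dictionary update means building a new instance and swapping it in, which fits the assumption that updates are rare.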
+3  A: 

Assuming that the company has challenging problems and is trying to recruit smart developers who can think at the systems level as well as write programs ... this is a really good interview question.

I had questions like this when I interviewed for my last job. It was a great job!

Stephen C
+2  A: 

Write a program to read the dictionary and generate a bunch of static html files with the word / definition. The name of each file would be the word they wanted to look up.

Write a jsp that simply forwards to the html document according to the url.

One server should be able to handle the load.

If they have a problem with that solution tell them their requirements should be more specific.

ScArcher2
You could even do this with all static html pages and a little bit of javascript.....
ScArcher2
A: 

You don't discuss any other questions, but given the "fairly long quiz" statement, it sounds like some generic "all inclusive" quiz they send out to everyone whose resume didn't hit the shredder on round one.

It does NOT sound like a particularly "high quality" employer, IMO.

As others have said - unless you are in need of this job, I'd just toss the thing and reply with "I don't do 'fishing expedition' quizzes". If enough job seekers refused to go along with this stuff, it might help put an end to it.

Cheers,

-R

Huntrods
A: 

I would hope that there is more to your interview quiz than a list of questions you could easily answer if you had a copy of the language specification handy.

Programming is fundamentally a problem-solving discipline. To check someone's problem-solving skills, you need questions that are actual problems, not trivia questions about language features. With a truly open-ended question, the candidate may come up with a good answer that you hadn't even thought of. That's not evidence of a bad question, it's evidence of a good candidate.

quillbreaker
A: 

Feel free to cut and paste my answer:

Average loads are not what kills web sites, it's peak loads which govern site design - and the expected peak loads were not specified. Nor was the length of the key word in the querystring, which makes the rest of the calculations unsolvable. As a senior whatever it's my job to ask questions to lock down the details and provide precise designs and risk factors for real systems of business value, not to provide useless back-of-the-envelope designs for underspecified, contrived toy systems. And you can quote me on that.

extra credit: ask them if they've considered using javascript for that ;-)

Steven A. Lowe
A: 

Clearly, there must be a component that can handle over 30,000 queries per second against a dictionary of 50 million words. If this is held entirely in memory, the component will need about 6 GB of memory. So use a cache: keep the most frequent items in memory and the others in a file.
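One way to sketch such a cache in plain Java is an access-ordered `LinkedHashMap`. Note the substitution: this evicts the least recently used entry, which only approximates "most frequent". The class name is hypothetical, and `slowLookup` is a stand-in for whatever reads the file-backed store.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Keeps recently requested words in memory and falls back to a slower store.
final class HotWordCache {
    private final Map<String, String> cache;
    // Hypothetical stand-in for the file-backed lookup; returns null if unknown.
    private final Function<String, String> slowLookup;

    HotWordCache(final int capacity, Function<String, String> slowLookup) {
        this.slowLookup = slowLookup;
        // accessOrder = true makes iteration order least-recently-used first.
        this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > capacity; // evict once over capacity
            }
        };
    }

    // synchronized because an access-ordered LinkedHashMap mutates on reads.
    synchronized String lookup(String word) {
        String definition = cache.get(word);
        if (definition == null) {
            definition = slowLookup.apply(word);
            if (definition != null) {
                cache.put(word, definition);
            }
        }
        return definition;
    }

    // containsKey does not count as an access, so it leaves the order unchanged.
    synchronized boolean isCached(String word) {
        return cache.containsKey(word);
    }
}
```

A single lock like this would likely become a bottleneck at 30,000 requests per second; it is here only to keep the sketch correct, and a real design would use a concurrent cache or partition the keys.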

Then comes the big problem: handling more than 30,000 HTTP requests per second. There are lots of solutions...

DeepNightTwo
+1  A: 

Not all interview questions were meant to be solved. Sometimes questions are just there to see what you know, how you think, etc.

Just answer truthfully and professionally (including, of course, the issues that seem impossible) and hope your mentality matches theirs.

Neko
A: 

I would consider this to be a "how would this guy tackle a tough assignment" question, where the requirements are set so high that it is almost certain to be unknown territory.

In other words, how do you extrapolate from your current experience level?

Thorbjørn Ravn Andersen
A: 

Sounds like a job for a fast in-memory key-value store like Redis, plus sharding if you need to use memory-constrained computers, plus whatever front-end is most convenient. Cluster if necessary. You could get better performance by getting fancy, but there's something to be said for getting a working solution, fast.
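The sharding part can be illustrated without any Redis client code. The sketch below, with hypothetical names, only shows how a front-end might pick which of N store instances holds a given word; talking to the chosen instance is left out.

```java
// Maps each word to one of shardCount key-value store instances.
final class ShardRouter {

    static int shardFor(String word, int shardCount) {
        // floorMod keeps the result in [0, shardCount) even when
        // hashCode() is negative, which plain % would not.
        return Math.floorMod(word.hashCode(), shardCount);
    }

    public static void main(String[] args) {
        System.out.println(ShardRouter.shardFor("dictionary", 4));
    }
}
```

Simple modulo routing is easy to reason about, but changing the shard count moves most keys to a different shard. If shards are expected to be added later, consistent hashing is the usual alternative.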

Peter Scott