I need to design a system which has these basic components:

  • A web server that will receive ~100 requests/sec. It only needs to dump data into the raw data repository.
  • A raw data repository with a single table that receives 100 rows/s from the web server.
  • A raw data processing unit (simple processing only: removing invalid raw data, inserting missing components into damaged raw data, etc.)
  • A processed data repository

Does it make sense in such a system to have a service layer on which all components would be built? All inter-component interaction will go through the service layers. While this would make the system easily upgradeable and maintainable, would it not also have a significant performance impact since I have so much traffic to handle?

+1  A: 

What do you see as the costs of having a separate service layer?

How do those costs compare with the costs you must incur anyway? In your case those seem to be at least:

  1. a network read for the request
  2. a database write for raw data
  3. a database read of raw data
  4. a database write of processed data

Plus some data munging.

What sort of services do you have in mind? Perhaps

  • saveRawData()
  • getNextRawData()
  • writeProcessedData()

Why would the overhead be any more than a procedure call? A service does not need to imply "separate process" or "web service marshalling".

I contend that structure is always of value: separation of concerns in your application really matters. Compared with the database activity, a few procedure calls will rarely cost much.
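As a rough sketch of that idea, the three operations above can sit behind an interface whose implementation lives in the same process, so each call is an ordinary method call. Everything here is an assumption made for illustration: the snake_case names, the in-memory storage standing in for the two repositories, the rule that a record without an `id` is invalid, and the default `source` value.

```python
from abc import ABC, abstractmethod
from collections import deque
from typing import Optional


class RawDataService(ABC):
    """Logical service boundary; it says nothing about physical deployment."""

    @abstractmethod
    def save_raw_data(self, record: dict) -> None: ...

    @abstractmethod
    def get_next_raw_data(self) -> Optional[dict]: ...

    @abstractmethod
    def write_processed_data(self, record: dict) -> None: ...


class InProcessRawDataService(RawDataService):
    """Hypothetical in-memory implementation; a real one would wrap the databases."""

    def __init__(self) -> None:
        self._raw = deque()
        self.processed = []

    def save_raw_data(self, record: dict) -> None:
        self._raw.append(record)

    def get_next_raw_data(self) -> Optional[dict]:
        return self._raw.popleft() if self._raw else None

    def write_processed_data(self, record: dict) -> None:
        self.processed.append(record)


def process_pending(service: RawDataService) -> int:
    """Drain raw records, drop invalid ones, fill a missing field, store the rest."""
    handled = 0
    while (record := service.get_next_raw_data()) is not None:
        handled += 1
        if "id" not in record:  # assumed validity rule
            continue
        cleaned = {"source": "unknown", **record}  # assumed default for a missing field
        service.write_processed_data(cleaned)
    return handled
```

The callers depend only on `RawDataService`, so a remote implementation could be swapped in later without touching `process_pending`.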

In passing: the raw data might best be persisted to a queueing system. You can then get some natural scaling by having many queue readers on separate machines if you need them. In effect, the queueing system naturally introduces some service-like concepts.
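A minimal in-process sketch of the queue-with-many-readers shape, assuming Python's standard `queue.Queue` as a stand-in for a real message broker and threads as a stand-in for readers on separate machines. The function name `run_pipeline` and the validity rule are hypothetical.

```python
import queue
import threading

_STOP = object()  # sentinel telling a reader to shut down


def run_pipeline(raw_records, reader_count=3):
    """Push raw records onto a queue and let several readers process them."""
    raw_queue = queue.Queue()
    processed = []
    lock = threading.Lock()

    def reader():
        while True:
            item = raw_queue.get()
            if item is _STOP:
                break
            if item.get("id") is not None:  # assumed validity rule
                with lock:
                    processed.append(item)

    readers = [threading.Thread(target=reader) for _ in range(reader_count)]
    for thread in readers:
        thread.start()
    for record in raw_records:  # the web server's only job: enqueue
        raw_queue.put(record)
    for _ in readers:  # one sentinel per reader
        raw_queue.put(_STOP)
    for thread in readers:
        thread.join()
    return processed
```

Adding capacity then means raising `reader_count` (or, with a real broker, starting more reader processes); the producer side does not change.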

djna
Interesting point about comparison of service layer overhead vs database activities. Never thought in that direction..
Raze2dust
+1  A: 

Personally, I feel that you might be focusing too much on low-level implementation details when designing the system. Before looking at how to lay out the components, assemblies or services, you should be thinking about how to architect the system.

You could start with the following high-level steps and build your system architecture around them:

  1. Confirm the technical skill set of the development team and the operations/support team.
  2. Agree on an initial finite list of systems that will integrate to your service, the protocols they support and some SLAs.
  3. Decide on the messaging strategy.
  4. Understand how you will deploy your service/system.
  5. Decide on the choice of middleware (ESBs, Message Brokers, etc), databases (SQL, Oracle, Memcache, DB2, etc) and 3rd party frameworks/tools.
  6. Decide on your caching and data latency strategy.
  7. Break your application into the various areas of business responsibility - This will allow you to split up the work and allow easier communication of milestones during development/testing and implementation.
  8. Design each component as required to meet the areas of responsibility. The areas of responsibility should naturally lead you to decide how to design each component, assembly or service.

Obviously not all of the above will match your specific case but I would suggest that they should at least be given some thought.

Good luck.

Kane
Thanks for the steps.. they make the way ahead a lot clearer.
Raze2dust
+2  A: 

Here's what can happen unless you guard against it.

In the communication between layers, some format is chosen, like XML. Then you build it and run it and find out the performance is not satisfactory.

Then you mess around with profilers which leave you guessing what the problem is.

When I worked on a problem like this, I used the stackshot technique and quickly found the problem. You would have thought it was I/O. NOT. It was that converting data to XML, and parsing the XML to recover the data structure, was taking roughly 80% of the time. It wasn't too hard to find a better way to do that. Result: a 5x speedup.
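To get a feel for this kind of cost, here is a small timing sketch. It is not the stackshot technique itself, just a direct comparison of handing data to a function versus pushing it through an XML round trip first. The record shape, the sizes, and the helper names (`to_xml`, `from_xml`, `handle`, `timed`) are all made up for illustration, and the numbers will vary by machine.

```python
import time
import xml.etree.ElementTree as ET


def to_xml(records):
    """Serialize a list of flat string dicts to an XML byte string."""
    root = ET.Element("records")
    for rec in records:
        item = ET.SubElement(root, "record")
        for key, value in rec.items():
            ET.SubElement(item, key).text = str(value)
    return ET.tostring(root)


def from_xml(payload):
    """Parse the XML produced by to_xml back into a list of dicts."""
    root = ET.fromstring(payload)
    return [{field.tag: field.text for field in item} for item in root]


def handle(records):
    """Stand-in for the receiving layer's real work."""
    return len(records)


def timed(fn, repeat=20):
    """Return the total wall-clock time of calling fn `repeat` times."""
    start = time.perf_counter()
    for _ in range(repeat):
        fn()
    return time.perf_counter() - start


records = [{"id": str(i), "payload": "x" * 50} for i in range(1000)]
direct = timed(lambda: handle(records))
via_xml = timed(lambda: handle(from_xml(to_xml(records))))
print(f"direct call: {direct:.4f}s, XML round trip: {via_xml:.4f}s")
```

If the round trip dominates in a measurement like this for your real payloads, that points at the format between layers rather than at the layering itself.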

Mike Dunlavey
We were thinking of REST-based APIs for communication... yeah, this might be a problem area later. Thanks for the heads up!
Raze2dust
Why does a service layer imply XML, or any serialization? Services can be co-located. Distinguish physical deployment decisions (perhaps tiers) from logical separation. For example, you could technically have an EJB with local interfaces (pass by reference), remote interfaces (pass by value with serialization) and a web service interface.
djna
@djna: Why does it imply XML or any serialization? It doesn't. It's just that that's what tends to get built. There's really no way to tell in advance that such a thing will be a performance issue except that, in hindsight, it is.
Mike Dunlavey
+1 Ah yes, good advice.
djna