I'm building an integration test for a web application that has multiple interdependent services. All of them depend on a shared resource in order to run correctly. I'd like to make sure that the data in the system is sane when it's live, so I'm leveraging a live service. I'm using Python to build it, and this is my idea for how to sandbox the services:

  • build a test runner using multiprocessing's BaseManager
  • chroot jail each of the services, run them as a background service
  • have a listener respond to incoming connections from the services and spit out the data

Does this seem sane? Other ideas include running each service as a process, or giving each service its own Python virtualenv to run in.

A: 

The third possibility is the easiest: avoid any locking issues by making your own daemon the intermediary. It is the only process with direct access to the resource, and all other processes have to go through it to get access.
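A rough sketch of that daemon, using multiprocessing's BaseManager as the question already suggests (the key/value `SharedResource` and all names here are hypothetical stand-ins for your real resource):

```python
from multiprocessing.managers import BaseManager

# Hypothetical shared resource: a key/value store that only the
# daemon process ever touches directly.
class SharedResource:
    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

class ResourceManager(BaseManager):
    pass

_resource = SharedResource()
ResourceManager.register('resource', callable=lambda: _resource)

def run_daemon(host='127.0.0.1', port=50000, authkey=b'test'):
    # The daemon is the only process with direct access; each service
    # connects to this address and works through a proxy object, so
    # all access is serialized by the daemon.
    manager = ResourceManager(address=(host, port), authkey=authkey)
    server = manager.get_server()
    server.serve_forever()
```

Services would then connect with a matching `ResourceManager(address, authkey)` plus `connect()` and call `put`/`get` through the proxy.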

pjz
A: 

One of the easiest approaches is to have a pool of Python processes which each serve a request, then terminate and get relaunched by a shell script. This small library, http://codespeak.net/execnet/, provides a very minimal way to create a server which listens for a request and then exits.
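The serve-one-request-then-exit pattern doesn't need execnet specifically; a minimal stdlib sketch of the same idea (port and echo protocol are made up for illustration) looks like this:

```python
import socket

def serve_one_request(host='127.0.0.1', port=60000):
    # Accept exactly one connection, answer it, then return so the
    # process can exit; a supervising shell loop relaunches it, e.g.:
    #   while true; do python worker.py; done
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind((host, port))
        srv.listen(1)
        conn, _addr = srv.accept()
        with conn:
            request = conn.recv(4096)
            conn.sendall(b'echo: ' + request)
```

Because each worker dies after one request, any state corruption is thrown away with the process.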

Take a look at this recipe: instantiate gateways through sockets. I have used it to build a simple cluster of agnostic Python processes. They can execute small pieces of Python code, and if you chroot jail each of the services, you gain a good level of isolation.

By the way, I suggest you avoid having different privilege levels (sandboxes) within the same Python process. It is impractical: for instance, years ago hosted Zope/Plone sites had a lot of problems because one badly designed plugin could take down an entire big site.

Python processes are fast to start and shut down, and the operating system can handle the dynamic load better than our application code can :)

daitangio
A: 

You definitely don't want to test with live data first. In order to build the integration test, you should first mock your dependencies and use I/O sets you control. Having expected input and output is very important. Building those unit tests will help you immensely when doing your integration testing.
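A minimal sketch of that mocking step with `unittest.mock` (the `fetch_user_count` function and its `client` dependency are hypothetical):

```python
from unittest import mock

# Hypothetical function under test; in the real system `client`
# would talk to a live service.
def fetch_user_count(client):
    return client.get('/users/count')

def test_fetch_user_count():
    # Replace the live dependency with a mock whose I/O we control,
    # so both the input and the expected output are known in advance.
    client = mock.Mock()
    client.get.return_value = 42
    assert fetch_user_count(client) == 42
    client.get.assert_called_once_with('/users/count')
```

With the dependency mocked out, the test is deterministic and never touches live data.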

As for your specific question, you can use a proxy to intercept the data or decorate your calling function to add logging. Take a look at Aspect Oriented Programming (AOP) for more information on interceptors.
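The decorator route can be as small as this sketch (the wrapped `lookup` function stands in for whatever call you want to intercept):

```python
import functools
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger('intercept')

def intercept(func):
    # Minimal AOP-style interceptor: log every call's arguments and
    # result without the wrapped function knowing about it.
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        log.info('call %s args=%r kwargs=%r', func.__name__, args, kwargs)
        result = func(*args, **kwargs)
        log.info('%s returned %r', func.__name__, result)
        return result
    return wrapper

@intercept
def lookup(key):
    # Hypothetical call into the shared resource.
    return {'users': 3}.get(key)
```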

If you are using WSGI, you can write a middleware piece to handle the interception and logging. Take a look at CherryPy's wsgiserver.py module for help with that; Django also uses middleware and their docs might be able to help regarding middleware.
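A generic logging middleware (not specific to CherryPy or Django) can be sketched in a few lines; `demo_app` and the `(path, status)` record format are just illustrative choices:

```python
def logging_middleware(app, records):
    # Wrap any WSGI app; record (path, status) for each request while
    # passing the response through unchanged.
    def wrapped(environ, start_response):
        def recording_start_response(status, headers, exc_info=None):
            records.append((environ.get('PATH_INFO'), status))
            return start_response(status, headers, exc_info)
        return app(environ, recording_start_response)
    return wrapped

def demo_app(environ, start_response):
    # Stand-in for the real application.
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'hello']
```

Because it only wraps the callable, the same middleware works under any WSGI server.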

Scott
A: 

Perhaps you should take a step back and ask a few questions first.

  • What is the most important part to have tested?
  • How difficult is it to setup that test?
  • Is the cost of setting up the test worth getting the test results?
  • Can I get most of what I want tested with a simpler test?

Whichever way you go, I would use a fixture based on live data plus the expectation of what that data becomes. This keeps your test deterministic and therefore automatable.
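One way to sketch that (file format and helper names are illustrative): snapshot the live data once into a fixture file, then replay it on every run and compare against the recorded expectation.

```python
import json

# Snapshot live data once, then replay it in every test run so the
# input (and therefore the expected output) never changes.
def save_fixture(path, data):
    with open(path, 'w') as f:
        json.dump(data, f)

def load_fixture(path):
    with open(path) as f:
        return json.load(f)

def check_expectation(transform, fixture, expected):
    # Apply the code under test to the fixture data and compare with
    # the recorded expectation of what that data becomes.
    return transform(fixture) == expected
```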

If the most important piece is a portion of logic, that can be tested via a unit test with known input/output and mocks.

If testing the integration itself is really the most important part, then I would try to strike a balance, mocking out as many moving pieces as I felt comfortable with in order to make a more manageable test.

The more networked resources you use, the more complex the system, and the more tests it should have. You have to think about timing issues, service uptime, timeouts, error states, etc. You can also fall into the trap of creating a nondeterministic test. If your assertions end up looking for differences in timings, rely on particular timings, or rely on an unreliable service which breaks a lot, then you may end up with a test which is worthless because of the amount of "noise" from false-positive breaks.

If you want to move toward a continuous integration model, you'll also need to consider the complexity of having to manage (start up and shut down) multiple processes with each test run. In general you get an easier test to manage if the test is the single running process and the other "processes" are function calls to the appropriate entry points in the code.
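When you do need real child processes, keeping the test as the one long-lived process that owns their lifecycle might look like this sketch (the `print`-based "service" is a placeholder for your real service command):

```python
import subprocess
import sys
import unittest

class ServiceIntegrationTest(unittest.TestCase):
    # Hypothetical: the test is the single long-lived process; it
    # starts the service under test in setUp and guarantees shutdown
    # in tearDown, even when an assertion fails.
    def setUp(self):
        self.proc = subprocess.Popen(
            [sys.executable, '-c', 'print("service ready")'],
            stdout=subprocess.PIPE)

    def tearDown(self):
        self.proc.terminate()
        self.proc.wait()

    def test_service_announces_startup(self):
        line = self.proc.stdout.readline()
        self.assertEqual(line.strip(), b'service ready')
```

Putting startup/shutdown in `setUp`/`tearDown` means a failing test can never leak a running service into the next test.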

dietbuddha