views: 537 · answers: 3

I am using RRDTool (http://oss.oetiker.ch/rrdtool/) as a graphing back-end for storing performance metrics. This is done via the RRDTool CLI from a Python script.

My problem is that the script is multithreaded and each thread updates the RRD at a pretty rapid pace. Sometimes an update fails because one thread is accessing the RRD file while another tries to access it at the same time.

I was under the impression that this was safe because RRDTool uses its own locking mechanism, but apparently that isn't true.

Does anyone have a good approach for concurrent access to an RRD?

I can think of a few ways to go:

  1. Have one thread own a queue and feed the RRD only from that single thread.

  2. Create my own locking mechanism inside the Python script. (How would I do this?)
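A minimal sketch of option 1: all worker threads enqueue their samples and a single dedicated thread drains the queue, so only one thread ever touches the RRD file. The actual write function is injected here (a stub stands in for the real rrdtool call), so the names and the demo values are illustrative, not from the original post.

```python
import queue
import threading

class RRDWriter:
    """Serialise all RRD updates through one writer thread.

    update_fn performs the actual write (in real use, a call to the
    rrdtool CLI); it is injected so the pattern can run without rrdtool.
    """
    _SENTINEL = object()

    def __init__(self, update_fn):
        self._queue = queue.Queue()
        self._update_fn = update_fn
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def update(self, rrd_path, value):
        # Safe to call from any worker thread; it only enqueues.
        self._queue.put((rrd_path, value))

    def close(self):
        # Signal the writer to finish and wait for the queue to drain.
        self._queue.put(self._SENTINEL)
        self._thread.join()

    def _run(self):
        while True:
            item = self._queue.get()
            if item is self._SENTINEL:
                break
            path, value = item
            self._update_fn(path, value)  # only this thread touches the RRD

# Demo with a stub in place of a real rrdtool update:
written = []
writer = RRDWriter(lambda path, value: written.append((path, value)))
for i in range(5):
    writer.update("metrics.rrd", i)
writer.close()
print(written)
```

Because queue.Queue is FIFO and there is a single consumer, updates are applied in the order they were enqueued, with no locking needed around the file itself.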

Got anything better, or have you run into this issue before?

+1  A: 

An exclusive lock ought to be enough for this problem:

Define your lock object at the main level, not at the thread level, and you're done.

Edit in response to comment:

If you create your lock (lock = threading.Lock()) at the thread level, you will have one lock object per running thread, but you really want a single lock guarding the rrdtool updates to the file, so it must be defined at the main level.
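To make the point concrete, here is a sketch of the pattern: one threading.Lock created at module level and shared by all threads. A stub stands in for the real work (in the original setup it would be a subprocess call to the rrdtool CLI); the names and values are illustrative.

```python
import threading

# One lock shared by ALL threads: created at module (main) level, so
# every thread sees the same object. A lock created inside the thread
# function would be per-thread and would guard nothing.
rrd_lock = threading.Lock()

results = []

def update_rrd(sample):
    # Stand-in for the real update, e.g.
    # subprocess.check_call(["rrdtool", "update", path, data]).
    # The lock ensures only one thread runs it at a time.
    with rrd_lock:
        results.append(sample)

threads = [threading.Thread(target=update_rrd, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(results))  # [0, 1, 2, 3]
```

The with statement acquires the lock on entry and releases it on exit, even if the guarded call raises, which is why it is preferable to explicit acquire()/release() pairs.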

Johan Buret
why not at the thread level?
Corey Goldberg
A: 

You could also try using rrdcached to do the updates. All writes are then serialised through rrdcached. When you want to read the RRD to generate graphs, you tell the daemon to flush it, and the on-disk RRD then reflects the latest state.

All the RRD tools will do this transparently if pointed at the caching daemon via an environment variable.
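A sketch of that setup, assuming a Unix-socket daemon (the socket path, base directory, and RRD filename below are examples, not from the answer):

```shell
# Start the caching daemon (paths are illustrative):
rrdcached -l unix:/var/run/rrdcached.sock -b /var/lib/rrd -w 300

# Point every rrdtool invocation at the daemon via the environment:
export RRDCACHED_ADDRESS=unix:/var/run/rrdcached.sock

# Updates now go through the daemon, which serialises the writes:
rrdtool update metrics.rrd N:42

# Before reading the file for graphing, flush pending updates to disk:
rrdtool flushcached metrics.rrd
```

With RRDCACHED_ADDRESS set, the rrdtool subcommands use the daemon automatically; the same address can also be passed per command with the --daemon option.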

stsquad
+1  A: 

This thread on the rrd-users list may be useful. The author of rrdtool states there that its file locking handles concurrent reads and writes.

lfagundes