Have you ever wanted to test and quantitatively show whether your application would perform better as a static or shared build, stripped or non-stripped, with or without UPX, gcc -O2 or gcc -O3, hash or btree, and so on? If so, this is the thread for you. There are hundreds of ways to tune an application, but how do we collect, organize, process, and visualize the consequences of each experiment?
I have been looking for several months for an open source application performance engineering/profiling framework, similar in concept to Mozilla's Perftastic, in which I can develop, build, test, and profile hundreds of incarnations of different tuning experiments.
Some requirements:
Platform
SUSE32 and SUSE64
Data Format
Very flexible, compact, simple, and hierarchical. There are several possibilities, including the following (a rough sketch of a record follows the list):
- Custom CSV
- RRD
- Protocol Buffers
- JSON
- No XML. There is lots of data and XML is too verbose
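For the JSON option, here is a minimal sketch of what one hierarchical record per test run could look like. The field names are only placeholders I made up for illustration, not a proposed schema:

    import json

    # Hypothetical record for one test run of one build experiment.
    record = {
        "experiment": "static-vs-shared",
        "build": {"tag": "r1234", "gcc": "4.3.2", "cflags": "-O2", "linkage": "static"},
        "host": {"name": "build01", "platform": "SUSE64"},
        "test": "regression/test_0042",
        "metrics": {
            "wall_time_s": 1.84,
            "sys_time_s": 0.12,
            "max_rss_kb": 52340,
            "binary_size_bytes": 1943552,
            "open_fds": 17,
        },
    }

    print(json.dumps(record, indent=2))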
Data Acquisition
Flexible and customizable plugins. There is lots of data to collect from the application, including performance data from /proc, sys time, wall time, CPU utilization, memory profile, leaks, valgrind logs, arena fragmentation, I/O, localhost sockets, binary size, open fds, etc., and some from the host system. My language of choice for this is Python; I would develop these plugins to monitor and/or parse data in all the different formats and store it in the framework's data format.
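As a rough illustration of the kind of plugin I have in mind, here is a sketch that samples a process's memory footprint and open fd count from /proc and emits one JSON line per sample. Only the /proc reads are standard Linux; the "plugin" shape itself is invented:

    import json
    import os
    import time

    def sample_process(pid):
        """Collect a few cheap metrics for a running process from /proc (Linux only)."""
        with open("/proc/%d/statm" % pid) as f:
            # statm fields are in pages: size resident shared text lib data dt
            pages = [int(x) for x in f.read().split()]
        page_kb = os.sysconf("SC_PAGE_SIZE") // 1024
        return {
            "timestamp": time.time(),
            "vsize_kb": pages[0] * page_kb,
            "rss_kb": pages[1] * page_kb,
            "open_fds": len(os.listdir("/proc/%d/fd" % pid)),
        }

    if __name__ == "__main__":
        # Sample ourselves as a demo and emit one JSON line.
        print(json.dumps(sample_process(os.getpid())))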
Tagging
All experiments would be tagged with data like GCC version and compile options, platform, host, app options, experiment, build tag, etc.
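Roughly, I picture the tag set as a flat dict attached to every record, something like this (all values invented):

    tags = {
        "gcc": "4.3.2",
        "cflags": "-O3",
        "platform": "SUSE64",
        "host": "perf-host-01",
        "app_options": "--index=btree",
        "experiment": "btree-vs-hash",
        "build_tag": "nightly-1234",
    }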
Graphing
History, Comparative, Hierarchical, Dynamic and Static.
- The application builds are done by a custom CI server, which has released a new app version several times per day for the last 3 years straight. This is why we need continuous trend analysis: when we add new features, make bug fixes, or change build options, we want to automatically gather profiling data and see the trend. This is where generating the various static graphs is needed.
- For analysis, Mozilla's dynamic graphs are great for comparative graphing. It would be great to have comparative graphing between different tags, for example comparing N build versions, platforms, build options, etc.
- We have a test suite of 3K tests. Data will be gathered per test and rolled up at every level: from intra-test data, to per test, to per tagged group, to the complete regression suite.
- Possibilities include RRDTool, Orca, Graphite
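If Graphite ended up being the backend, getting a datapoint into it is just one line over a TCP socket to Carbon's plaintext listener (port 2003 by default). The metric path below is only an example mirroring the tagging scheme:

    import socket
    import time

    def send_to_graphite(metric_path, value, host="localhost", port=2003):
        """Push one datapoint to a local Carbon instance via the plaintext protocol."""
        line = "%s %f %d\n" % (metric_path, value, int(time.time()))
        sock = socket.create_connection((host, port))
        try:
            sock.sendall(line.encode("ascii"))
        finally:
            sock.close()

    # e.g. one wall-time sample for one test under one build tag
    send_to_graphite("perf.suse64.gcc_O3.static.test_0042.wall_time_s", 1.84)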
Analysis on a grouping basis
- Min
- Max
- Median
- Avg
- Standard Deviation
- etc
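For the group statistics themselves, a quick sketch using only the standard library (Python 3's statistics module; the group key is whatever tag we decide to roll up on):

    from collections import defaultdict
    from statistics import mean, median, pstdev

    def summarize(records, group_key, metric):
        """Roll up a metric per group: min, max, median, avg, stddev."""
        groups = defaultdict(list)
        for rec in records:
            groups[rec[group_key]].append(rec[metric])
        summary = {}
        for key, values in groups.items():
            summary[key] = {
                "min": min(values),
                "max": max(values),
                "median": median(values),
                "avg": mean(values),
                "stddev": pstdev(values),
            }
        return summary

    # e.g. wall time grouped by build tag
    records = [
        {"build_tag": "r100", "wall_time_s": 1.84},
        {"build_tag": "r100", "wall_time_s": 1.90},
        {"build_tag": "r101", "wall_time_s": 1.71},
    ]
    print(summarize(records, "build_tag", "wall_time_s"))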
Presentation
All of this would be presented and controlled through an app server; Django or TurboGears would be best.
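On the Django side, a view serving those rollups as JSON for the graphs could be as small as the sketch below. JsonResponse assumes a reasonably recent Django; the perfdb module, load_records(), and the summarize() helper from above are all hypothetical:

    from django.http import JsonResponse

    # Hypothetical storage layer; load_records() would pull whatever the
    # acquisition plugins stored, summarize() is the rollup sketched earlier.
    from perfdb import load_records, summarize

    def metric_summary(request, metric):
        """Serve per-build-tag rollups of one metric as JSON for the front-end charts."""
        records = load_records(experiment=request.GET.get("experiment"))
        return JsonResponse(summarize(records, "build_tag", metric))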