Have you ever wanted to test and quantitatively show whether your application would perform better as a static or shared build, stripped or non-stripped, with or without UPX, gcc -O2 or gcc -O3, hash or btree, and so on? If so, this is the thread for you. There are hundreds of ways to tune an application, but how do we collect, organize, process, and visualize the consequences of each experiment?
I have been looking for several months for an open source application performance engineering/profiling framework, similar in concept to Mozilla's Perftastic, in which I can develop, build, test, and profile hundreds of incarnations of different tuning experiments.
Some requirements:
Platform
SUSE32 and SUSE64
Data Format
Very flexible, compact, simple, and hierarchical. There are several possibilities, including the following (a rough sketch of a record follows the list):
- Custom CSV
- RRD
- Protocol Buffers
- JSON
- No XML. There is lots of data and XML is too verbose
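For the JSON option, here is a minimal sketch of what one hierarchical record per test run could look like. The field names are only placeholders I made up for illustration, not a proposed schema:

    import json

    # Hypothetical record for one test run of one build experiment.
    record = {
        "experiment": "static-vs-shared",
        "build": {"tag": "r1234", "gcc": "4.3.2", "cflags": "-O2", "linkage": "static"},
        "host": {"name": "build01", "platform": "SUSE64"},
        "test": "regression/test_0042",
        "metrics": {
            "wall_time_s": 1.84,
            "sys_time_s": 0.12,
            "max_rss_kb": 52340,
            "binary_size_bytes": 1943552,
            "open_fds": 17,
        },
    }

    print(json.dumps(record, indent=2))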
Data Acquisition
Flexible and customizable plugins. There is lots of data to collect from the application, including performance data from /proc, sys time, wall time, CPU utilization, memory profile, leaks, valgrind logs, arena fragmentation, I/O, localhost sockets, binary size, open fds, etc., and some from the host system. My language of choice for this is Python; I would develop these plugins to monitor and/or parse data in all the different formats and store it in the framework's data format.
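As a rough illustration of the kind of plugin I have in mind, here is a sketch that samples a process's memory footprint and open fd count from /proc and emits one JSON line per sample. Only the /proc reads are standard Linux; the "plugin" shape itself is invented:

    import json
    import os
    import time

    def sample_process(pid):
        """Collect a few cheap metrics for a running process from /proc (Linux only)."""
        with open("/proc/%d/statm" % pid) as f:
            # statm fields are in pages: size resident shared text lib data dt
            pages = [int(x) for x in f.read().split()]
        page_kb = os.sysconf("SC_PAGE_SIZE") // 1024
        return {
            "timestamp": time.time(),
            "vsize_kb": pages[0] * page_kb,
            "rss_kb": pages[1] * page_kb,
            "open_fds": len(os.listdir("/proc/%d/fd" % pid)),
        }

    if __name__ == "__main__":
        # Sample ourselves as a demo and emit one JSON line.
        print(json.dumps(sample_process(os.getpid())))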
Tagging
All experiments would be tagged with data like GCC version and compile options, platform, host, app options, experiment, build tag, etc.
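Roughly, I picture the tag set as a flat dict attached to every record, something like this (all values invented):

    tags = {
        "gcc": "4.3.2",
        "cflags": "-O3",
        "platform": "SUSE64",
        "host": "perf-host-01",
        "app_options": "--index=btree",
        "experiment": "btree-vs-hash",
        "build_tag": "nightly-1234",
    }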
Graphing
History, Comparative, Hierarchical, Dynamic and Static.
- The application builds are done by a custom CI server, which has released a new app version several times per day for the last 3 years straight. This is why we need continuous trend analysis: when we add new features, make bug fixes, or change build options, we want to automatically gather profiling data and see the trend. This is where generating the various static graphs is needed.
- For analysis, Mozilla's dynamic graphs are great for comparative graphing. It would be great to have comparative graphing between different tags, for example comparing N build versions, platforms, build options, etc.
- We have a test suite of 3K tests. Data will be gathered per test and rolled up at every level: from intra-test data, to per test, to per tagged group, to the complete regression suite.
- Possibilities include RRDTool, Orca, Graphite
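If Graphite ended up being the backend, getting a datapoint into it is just one line over a TCP socket to Carbon's plaintext listener (port 2003 by default). The metric path below is only an example mirroring the tagging scheme:

    import socket
    import time

    def send_to_graphite(metric_path, value, host="localhost", port=2003):
        """Push one datapoint to a local Carbon instance via the plaintext protocol."""
        line = "%s %f %d\n" % (metric_path, value, int(time.time()))
        sock = socket.create_connection((host, port))
        try:
            sock.sendall(line.encode("ascii"))
        finally:
            sock.close()

    # e.g. one wall-time sample for one test under one build tag
    send_to_graphite("perf.suse64.gcc_O3.static.test_0042.wall_time_s", 1.84)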
Analysis on a grouping basis
- Min
- Max
- Median
- Avg
- Standard Deviation
- etc
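For the group statistics themselves, a quick sketch using only the standard library (Python 3's statistics module; the group key is whatever tag we decide to roll up on):

    from collections import defaultdict
    from statistics import mean, median, pstdev

    def summarize(records, group_key, metric):
        """Roll up a metric per group: min, max, median, avg, stddev."""
        groups = defaultdict(list)
        for rec in records:
            groups[rec[group_key]].append(rec[metric])
        summary = {}
        for key, values in groups.items():
            summary[key] = {
                "min": min(values),
                "max": max(values),
                "median": median(values),
                "avg": mean(values),
                "stddev": pstdev(values),
            }
        return summary

    # e.g. wall time grouped by build tag
    records = [
        {"build_tag": "r100", "wall_time_s": 1.84},
        {"build_tag": "r100", "wall_time_s": 1.90},
        {"build_tag": "r101", "wall_time_s": 1.71},
    ]
    print(summarize(records, "build_tag", "wall_time_s"))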
Presentation
All of this would be presented and controlled through an app server; Django or TurboGears would be best.
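On the Django side, a view serving those rollups as JSON for the graphs could be as small as the sketch below. JsonResponse assumes a reasonably recent Django; the perfdb module, load_records(), and the summarize() helper from above are all hypothetical:

    from django.http import JsonResponse

    # Hypothetical storage layer; load_records() would pull whatever the
    # acquisition plugins stored, summarize() is the rollup sketched earlier.
    from perfdb import load_records, summarize

    def metric_summary(request, metric):
        """Serve per-build-tag rollups of one metric as JSON for the front-end charts."""
        records = load_records(experiment=request.GET.get("experiment"))
        return JsonResponse(summarize(records, "build_tag", metric))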