I have created a prototype of a custom ORM tool using aspect oriented programming (PostSHarp) and achieving persistence ignorance (before compile-time). Now I tried to find out how much overhead does it introduce compared to using pure DataReader and ADO.NET. I made a test case - insert, read, delete data (about 1000 records) in MS SQL Server 2008 and MySQL Community Edition. I run this test multiple times using pure ADO.NET and my custom tool.
I expected that results will depend on many factors - memory, swapping, CPU, other processes so I ran tests for many times (20-40). But the results were really unexpected. They just differed too much between those cases. If there were just some extreme values, I could ignore them (maybe swapping ocurred or smth. like that) but they were so different that I am sure I cannot trust this kind of testing. Almost half of times my ORM showed 10% better performance than pure ADO.NET, other times it was -10%.
Is there any way I can make those tests reliable? I do not have a powerful computer with lots of memory, but maybe I somehow can make MS SQL and MySQL or ADO.NET to be as consistent as possible during those tests? And how about count of records - which is more reliable, using small amount of records and running more times or other way?