ansaurus

Question

std::pair<int, int> vs struct with two int's

Answer 1

+18 A:

std::pair<int, int>::pair() constructor initializes the fields with default values (zero in case of int) and your struct Cell doesn't (since you only have an auto-generated default constructor that does nothing).

Initializing requires writing to each field which requires a whole lot of memory accesses that are relatively time consuming. With struct Cell nothing is done instead and doing nothing is a bit faster.

sharptooth 2009-10-22 12:35:16

does it take 1.5 seconds to set about 8 MB's to zero?

Etan 2009-10-22 12:37:40

Don't forget cache misses.

sharptooth 2009-10-22 12:38:39

No, the problem is the function calls to the constructor. If your array is global, it's initialized to zero anyway. Probably the compiler doesn't optimize out the constructor calls.

Johannes Schaub - litb 2009-10-22 12:38:48

You need to call the constructor, find the memory, set it, etc... it's not just setting N bits of contigous memory to 0 via memset.

Lemurik 2009-10-22 12:40:33

@Lemurik: I’m not convinced. All the `pair` constructor does is delegating to the constructors of its (POD) members. A compiler could (and IMHO, *should*) recognize this, elide the constructor call and simply zero out the memory. The analysis to do this surely cannot be hard?

Konrad Rudolph 2009-10-24 09:14:51

… alas, it doesn’t (just tested it). :-(

Konrad Rudolph 2009-10-24 09:21:14

Answer 2

+1 A:

My guess it's the way std::pair is created. There is more overhead during invoking a pair constructor 1001x1001 times than when you just allocate a memory range.

Ashalynd 2009-10-22 12:45:08

Answer 3

+1 A:

it is really a good example about one should write C++ and use STL carefully. any thoughts?

my project is working on a C&C++ code level benchmark test tool in which we'll make lots of code samples to found out what is "good" code and what is a "bad" coding habit. see http://effodevel.googlecode.com to learn more about C9B.M. planning. anyone if you had experienced lots of such cases, please kindly join the project to help us.

EffoStaff Effo 2009-10-22 12:47:27

This isn't an answer to the question.

Brian Neal 2009-10-22 17:44:09

Any thoughts? Yes. Your project seems to buy into the general idea that optimization is all about *low-level optimization* (and big-O). I suggest the problem is much bigger than that, namely *galloping generality*, and that you might consider a broader scope.http://stackoverflow.com/questions/926266/performance-optimization-strategies-of-last-resort/927773#927773

Mike Dunlavey 2009-10-24 15:20:31

no plan to do a bigger one yet presently. i'd make progress step by step, such as code level, algo level and architecture level, and so on. we're understanding language and compilers now. thanks for your comments

EffoStaff Effo 2009-10-25 05:30:07

Let me suggest, when you want to be more ambitious, you discard any prior ideas of what the higher levels might be. Rather, scrounge around your organization for real (large) code that has performance problems, and learn from them. That's how I found out that worrying about low-level optimization is just like looking for one's keys under the street light because that's where the light is.

Mike Dunlavey 2009-10-28 01:56:25

i have no performance problems with my software (80K LOC); so your problem is not my point. what we are making is that a tool helping develops to aware of and then understand what is "good" code and what is a "bad" coding habit, say a coding guideline. that's all.

EffoStaff Effo 2009-10-28 05:50:39

Answer 4

+6 A:

The answers so far don't explain the full magnitude of the problem.

As sharptooth has pointed out, the pair solution initializes the values to zero. As Lemurik pointed out, the pair solution isn't just initializing a contiguous block of memory, instead it is calling the pair constructor for every element in the table. However, even that doesn't account for it taking 1.5 seconds. Something else is happening.

Here's my logic:

Assuming you were on an ancient machine, say running at 1.33ghz, then 1.5 seconds is 2e9 clock cycles. You've got 2e6 pairs to construct, so some how each pair constructor is taking 1000 cycles. It doesn't take 1000 cycles to call a constructor that just sets to integers to zero. I would believe it if the number was less than 100 cycles.

I thought it would be interesting to see where else all these CPU cycles are going. I used the crappiest oldest C++ compiler I could find to see if I could attain the level of wastage required. That compiler was VC++ v6. In debug mode, it does something I don't understand. It has a big loop that calls the pair constructor for each item in the table - fair enough. That constructor sets the two values to zero - fair enough. But just before doing that, it sets all the bytes in a 68 byte region to 0xcc. That region is just before the start of the the big table. It then overwrites the last element of that region with 0x28F61200. Every call of the pair constructor repeats this. Presumably this is some kind of book keeping by the compiler so it knows which regions are initialized when checking for pointer errors at run time. I'd love to know exactly what this is for.

Anyway, that would explain where the extra time is going. Obviously another compiler may not be this bad. And certainly an optimized release build wouldn't be.

Andrew Bainbridge 2009-10-23 23:49:09

+1. congrats to retagging permissions

Etan 2009-10-24 00:41:40

Answer 5

+1 A:

These are all very good guesses, but as everyone knows, guesses are not reliable.

I would say just randomly pause it within that 1.5 seconds, but you'd have to be pretty quick. If you increased each dimension by a factor of about 3, you could make it take more like 10+ seconds, so it would be easier to pause.

Or, you could get it under a debugger, break it in the pair constructor code, and then single step to see what it is doing.

Either way, you would get a firm answer to the question, not just a guess.

Mike Dunlavey 2009-10-24 15:07:34

ansaurus

tags:

views:

answers:

std::pair<int, int> vs struct with two int's

related questions