views:

329

answers:

2

I am fairly new to C++, having much more C experience.

I am writing a program that will use the string class, and began to wonder about the efficiency of the "length()" method.

I realized though that I didn't have a good answer to this question, and so was wondering if the answer to this and similar questions exist somewhere. While I am more than capable of determining the runtime of my own code, I'm at a bit of a loss when it comes to provided code, and so I find I can't accurately judge the efficiency of my programs.

Is there c++ documentation (online, or in "man" format) that includes information on the runtime of provided code?

Edit: I'm interested in this in general, not just string::length.

+5  A: 

All of the implementations I've seen are O(1).

The documentation you're looking for is the C++ standard -- I believe C++03 is the latest one at present. It isn't available online or in man format, it's sold commercially. There's a list of the places to find it, and recent prices, here.

Head Geek
+6  A: 

At present, time complexity of size() for all STL containers is underspecified. There's an open C++ defect report for that.

The present ISO C++ standard says that STL containers should have size() of constant complexity:

21.3[lib.basic.string]/2

The class template basic_string conforms to the requirements of a Sequence, as specified in (23.1.1). Additionally, because the iterators supported by basic_string are random access iterators (24.1.5), basic_string conforms to the the requirements of a Reversible Container, as specified in (23.1).

23.1[lib.container.requirements]/5

  • Expression: a.size()
  • Complexity: (Note A)

Those entries marked ‘‘(Note A)’’ should have constant complexity

However, "should" is not a binding requirement in the Standard parlance; indeed, the above applies to std::list as well, but in practice some implementations (notably g++) have O(N) std::list::size().

The only thing that can be guaranteed is that (end() - begin()) for a string is (possibly amortized) O(1). This is because string iterators are guaranteed to be random-access, and random-access iterators are guaranteed to have constant time operator-.

As a more practical issue, for all existing C++ implementations out there, the following holds:

  • std::string::size() is O(1)
  • std::vector::size() is O(1)

They are fairly obvious, as both strings and vectors are most efficiently implemented as contiguous arrays with separately stored size: contiguous because it gives fastest element access while satisfying all other complexity requirements, and storing size is because Container requirements demand that end() be constant-time.

Pavel Minaev
"possibly amortized O(1)". Amortized over what?
Steve Jessop
I'm not sure how to properly phrase it, but I was thinking about `std::deque` - and the general implementation approach (linked list of vectors) when I wrote that; in theory, a standard-compliant string could use deque or something similar as underlying storage. But come to think of it, I'm not even sure that deque iterators have O(1) `operator-` for iterators, amortized or not (though the Standard requires them to). Anyone care to compute the complexity for it?
Pavel Minaev
deque iterators are random access too, and that means they have O(1) operator- by definition. Any implementation that does otherwise is faulty, but you can't draw any other conclusions. It's not too hard to implement this since operator[] needs to be O(1) as well. So, in general you can efficiently calculate x-y as (x-begin())-(y-begin()).
MSalters
I fail to see how `deque` iterators can be O(1), given that to iterate from one to another can require iterating over several linked clusters of items - i.e. in effect iterating over a linked list - which is definitely not O(1). I believe that `operator[]` for `deque` is only _amortized_ O(1) as well, for the same reason.
Pavel Minaev