I can see that almost all modern APIs are developed in C. There are reasons for that: processing speed, low-level access, cross-platform support and so on.

Nowadays I program in C++ because of its object orientation, std::string and the STL, but mainly because it is a better C.

However, when my C++ programs need to interact with C APIs, I get really upset at having to convert char[] data to C++ strings, operate on those strings using their powerful methods, and finally convert the strings back to char[] (because the API needs to receive char[]).

If I repeat these operations for millions of records, processing time goes up because of the conversions. For that simple reason, I feel that char[] is an obstacle to treating C++ as a better C.

I would like to know if you feel the same. If not (I hope so!), I would really like to know the best way for C++ to coexist with char[] types without doing those awful conversions. Thanks for your attention.

+1  A: 

I'm not sure what you mean by "conversion", but won't the following suffice for moving between char*, char[], and std::string?

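#include <string>

char charString[] = {'a', 'b', 'c', '\0'};          // C array, explicitly null-terminated
std::string standardString(&charString[0]);         // copies the characters into the string
const char* stringPointer(standardString.c_str());  // read-only pointer, no copy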
Steve Guidi
Charles Bailey
Not really. For some reason, I thought the compiler would warn that char[] is not convertible to char*, which in retrospect is silly.
Steve Guidi
+3  A: 

If you use std::vector<char> instead of std::string, the underlying storage will be a C array that can be accessed with &someVec[0]. However, you do lose a lot of std::string conveniences such as operator+.

That said, I'd suggest just avoiding C APIs that mutate strings as much as possible. If you need to pass an immutable string to a C function, you can use c_str(), which is fast and non-copying on most std::string implementations.
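A minimal sketch of both cases mentioned above, assuming two made-up C-style functions (`legacy_upcase` and `legacy_length` are hypothetical stand-ins, not part of any real library). The mutating call goes through a `std::vector<char>` buffer; the read-only call just receives `c_str()`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a C API that mutates a caller-supplied string.
// It upper-cases ASCII letters in place.
void legacy_upcase(char* buf) {
    assert(buf != nullptr);
    for (; *buf != '\0'; ++buf) {
        if (*buf >= 'a' && *buf <= 'z') {
            *buf = static_cast<char>(*buf - 'a' + 'A');
        }
    }
}

// Hypothetical stand-in for a C API that only reads its argument.
std::size_t legacy_length(const char* s) {
    return std::strlen(s);
}

// Mutating case: copy into a vector<char> (plus terminator), let the C
// function write into the vector's contiguous storage, then copy back.
std::string upcase_via_vector(const std::string& in) {
    std::vector<char> buf(in.begin(), in.end());
    buf.push_back('\0');          // the C function expects a terminated string
    legacy_upcase(&buf[0]);       // writable pointer to the vector's storage
    return std::string(&buf[0]);  // copies up to the terminator
}

// Read-only case: no intermediate buffer, just hand over c_str().
std::size_t length_via_c_str(const std::string& in) {
    return legacy_length(in.c_str());
}
```

Note that the vector route still copies twice (in and out), so it only pays off when the C function really has to write into the buffer.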

bdonlan
It should be noted that in practice, there's no known C++ implementation in which std::string wouldn't use contiguous storage for string data. Because of that, C++0x will explicitly require that in the standard, and in general, it is safe to assume that this will hold for all existing and new implementations.
Pavel Minaev
However, std::string still doesn't offer a way to get a _mutable_ pointer to its underlying data, afaik
bdonlan
Using std::vector, the underlying storage will *probably* be an array, but there is no guarantee in the spec that it will be.
Christopher
@Christopher: C++03 was amended to include that guarantee: http://herbsutter.wordpress.com/2008/04/07/cringe-not-vectors-are-guaranteed-to-be-contiguous/
bdonlan
@Christopher: That C++98 didn't guarantee this was seen as an oversight by all involved in this part of the standardization, so it was later fixed. It was always safe to use `std::vector` in this way.
sbi
+1 for c_str(); all it does is return the underlying char* that an std::string uses, and therefore there is almost no overhead (associated function call alone).
Hooked
@Hooked: I don't think we established that it's currently guaranteed that `std::string` saves its data in a contiguous piece of memory (but see the discussion under jalf's answer). However, no other implementation is known, so even if it isn't guaranteed, in practice you're right.
sbi
A: 

I don't think it's as bad as you make it out to be.

There is a cost converting a char[] to a std::string, but if you're going to be modifying the string, you have to pay that cost anyway whether converting to a std::string or copying to another char[] buffer.

The conversion going the other way (via string.c_str()) is usually trivial. It's usually returning a pointer to an internal buffer (just don't give that buffer to code that will modify it).

Ferruccio
I forgot to mention that I also interact with C++ libraries, and that is another reason for the conversion: I read C strings and transform them to C++ strings to invoke external C++ functions. I feel very bad seeing that most of my C++ code consists of these conversions. Somebody told me, "Hey!! Forget C++ and do it in C!!"
A: 

I'm not sure why you would be constrained to using C strings and still have an environment that runs C++ code, but if you really don't want the overhead of conversion, then don't convert. Just write routines that operate on the C strings.

Another reason for converting to C++ style strings is bounds safety.
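As a rough illustration of a routine that works on the C string directly, here is a sketch of an in-place right-trim. The function name `rtrim_in_place` is made up for this example. It never copies the data, but it also shows the safety trade-off: the caller has to guarantee that the buffer is writable and terminated.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>

// Removes trailing whitespace from a writable, null-terminated C string.
// Works in place and returns the new length. No bounds checking beyond the
// terminator: passing an unterminated buffer is undefined behaviour.
std::size_t rtrim_in_place(char* s) {
    assert(s != nullptr);
    std::size_t len = std::strlen(s);
    while (len > 0 && std::isspace(static_cast<unsigned char>(s[len - 1]))) {
        --len;
    }
    s[len] = '\0';  // cut the string off after the last non-space character
    return len;
}
```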

Mike
Except C++ strings are far easier to manipulate...?
GMan
@GMan Thanks for pointing that out. The main point I was trying to get across with that second paragraph is that the poster didn't consider safety, which is a big one IMHO
Mike
A: 

"... because it is a better C."

Baloney. C++ is a vastly inferior dialect of C. The problems it solves are trivial, the problems it brings, much worse than those it solves.

smcameron
...as can especially be seen when it has problems interacting with a C API that uses 40-year-old technology to pass strings to and fro? Sorry, but that's silly.
sbi
+4  A: 

The C++ string class has a lot of problems, and yes, what you're describing is one of them.

More specifically, there is no way to do string processing without creating a copy of the string, which may be expensive.

And because virtually all string processing algorithms are implemented as class members, they can only be used on the string class.

A solution you might want to experiment with is the combination of Boost.Range and Boost.StringAlgo.

Range allows you to create sequences out of a pair of iterators. They don't take ownership of the data, so they don't copy the string. They just point to the beginning and end of your char* string.

And Boost.StringAlgo implements all the common string operations as non-member functions, that can be applied to any sequence of characters. Such as, for example, a Boost range.

The combination of these two libraries pretty much solves the problem. They let you avoid having to copy your strings to process them.
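To show the idea without pulling in Boost, here is a sketch that uses only the standard library. `CharRange`, `make_range`, `to_upper_in_place` and `starts_with` are hypothetical names invented for this example; they are not the Boost API, only a hand-rolled imitation of the same design: a non-owning pair of pointers plus non-member algorithms.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>

// Hypothetical minimal non-owning range: two pointers into an existing
// buffer. Creating one never copies the characters.
struct CharRange {
    char* first;
    char* last;
    char* begin() const { return first; }
    char* end() const { return last; }
    std::size_t size() const { return static_cast<std::size_t>(last - first); }
};

// Wraps a null-terminated C string without copying it.
CharRange make_range(char* s) {
    assert(s != nullptr);
    return CharRange{s, s + std::strlen(s)};
}

// Non-member algorithm: upper-cases the characters the range points at.
void to_upper_in_place(CharRange r) {
    std::transform(r.begin(), r.end(), r.begin(), [](char c) {
        return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    });
}

// Non-member algorithm: checks whether the range begins with prefix.
bool starts_with(CharRange r, const char* prefix) {
    const std::size_t n = std::strlen(prefix);
    return n <= r.size() && std::equal(prefix, prefix + n, r.begin());
}
```

Because the algorithms are free functions over a begin/end pair, the same code would work on a `std::string` or a `std::vector<char>` if the range type were templated on the iterator.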

Another solution might be to store your string data as std::strings all the time. When you need to pass a char* to some API function, simply pass it the address of the first character (&str[0]).

The problem with this second approach is that std::string doesn't guarantee that its string buffer is null-terminated, so you either have to rely on implementation details, or manually add a null byte as part of the string.
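A sketch of that second approach, under the assumption that you compile as C++11 or later, where std::string storage is required to be contiguous and null-terminated (under C++98/03 this relies on implementation details, as discussed in the comments). `legacy_fill` is a hypothetical stand-in for a C API that writes into a caller-supplied buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical C-style API: writes a terminated message into buf, using at
// most cap bytes including the terminator.
void legacy_fill(char* buf, std::size_t cap) {
    assert(buf != nullptr && cap > 0);
    std::strncpy(buf, "hello from C", cap - 1);
    buf[cap - 1] = '\0';
}

// Lets the C function write straight into the string's own storage.
std::string read_into_string(std::size_t cap) {
    assert(cap > 0);
    std::string s(cap, '\0');          // make cap writable characters available
    legacy_fill(&s[0], s.size());      // assumes contiguous storage (C++11 and later)
    s.resize(std::strlen(s.c_str()));  // shrink to what the API actually wrote
    return s;
}
```

The string is sized up front and trimmed afterwards, so the C function never writes past `size()` and no separate buffer is needed.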

jalf
Regarding the last paragraph, calling str.c_str() guarantees the string is nul terminated and returns a const pointer to the string.
David Smith
Also, taking the address of the buffer is a bad, bad, bad idea. The library makes no guarantees about the buffer at all. c_str() is the proper approach. However, remember that the pointer returned by c_str() is only good as long as you leave the string object alone while someone might be using the buffer. If you call any string methods, all bets are off.
Christopher
doesn't the standard guarantee that the buffer is contiguous? Or is that only guaranteed in C++0x?
jalf
@David: Yes, however, `c_str()` returns a `const char*` that can't be written to. That's not always sufficient.
sbi
@jalf: C++98 doesn't guarantee this. I think the universal view nowadays is that it should (probably since no library vendor ever found any advantage in using that freedom -- so practically, all std lib implementations allow this to work). I always copy strings into vectors for this, but it might well be that C++03 introduced that guarantee or that C++1x will -- it's just that I don't know. Might be worth a question on SO, though. `:)`
sbi
Niels Dekker
Thanks @sbi. I'm pretty sure C++1x (I liked the 0x name better, but guess we have to get used to this one) will offer this guarantee, but you're probably right that it's not guaranteed by 98 or 03. As for using at() instead, what exactly would that achieve? We don't want an exception to be thrown, we'd like to be able to handle the empty string case elegantly. There's really no universally good solution when storing the data in a string, as this discussion has shown. c_str(), at() and [] all have shortcomings. That's why I suggested using Boost.Range instead and storing the memory elsewhere.
jalf