views:

1634

answers:

6

I'm sorry for flaming std::string and std::wstring. They are quite limited and far from thread-safe. Performance-wise, they are not that good either. I miss simple features:

  1. Splitting a string into array/vector/list
  2. Simple & intuitive case-insensitive find & replace
  3. Support for i18n without worrying about string or wstring
  4. Conversion to and from int, float, double
  5. Conversion to and from UTF-8, UTF-16 & other encodings
  6. Thread-safe/reentrant
  7. Small footprint & no dependencies
  8. Highly portable & cross-platform

I've found Qt's QString useful, and I also found CBString (http://bstring.sourceforge.net/bstrFAQ.shtml).

Any other suggestions & comparisons? Thank you.

+2  A: 

For conversion, you can always break down and use the C library cstdlib.

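#include <cstdlib>
#include <iostream>
#include <string>

int main()
{
   std::string num;

   std::cin >> num;

   // atoi and atof take a const char*, so pass num.c_str(), not num itself
   int i = std::atoi(num.c_str());
   double d = std::atof(num.c_str());

   std::cout << i << ' ' << d << '\n';
   return 0;
}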

atoi = ASCII to integer; atof = ASCII to float (it actually returns a double).

As for find, use the STL function "find" defined in the <algorithm> header, or find_first_of (or similar). I also believe you can initialize a vector of chars from a std::string, but that is conjecture.
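A minimal sketch of both ideas, using only the standard library. The helper names index_of and to_char_vector are made up for illustration. The vector part works because std::vector has a constructor that takes an iterator range.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: index of the first occurrence of `ch` in `text`,
// or std::string::npos if it is absent. Uses std::find from <algorithm>.
std::size_t index_of(const std::string& text, char ch)
{
    std::string::const_iterator it = std::find(text.begin(), text.end(), ch);
    if (it == text.end())
        return std::string::npos;
    return static_cast<std::size_t>(it - text.begin());
}

// Hypothetical helper: copies the characters of a std::string into a
// std::vector<char> through the iterator-range constructor.
std::vector<char> to_char_vector(const std::string& text)
{
    return std::vector<char>(text.begin(), text.end());
}
```

Note that std::string also has its own find() member, which is usually the shorter route when the container is already a string.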

Hooked
Sure, but that's awkward. The functionality really should be available at the C++ library level, just not necessarily in string itself.
Steven Sudit
Thanks but I need case insensitive + i18n support too.
Viet
I'm not sure what you mean. What do you mean in the string itself?
Hooked
Oh, case insensitive. Well, my solution would be to simply copy the string, convert it to lower with the appropriately named tolower() function in a loop, manipulate it, and make the same manipulations to the copied string. You're right, it's convoluted, but it works.
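Roughly what that looks like, as a sketch only. The names to_lower_copy and find_ci are invented here, and std::tolower works byte by byte, so this assumes single-byte text and does nothing useful for UTF-8 or other multi-byte encodings.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Returns a lowercased copy of `s`. Assumption: single-byte characters in
// the current C locale; multi-byte encodings are not handled.
std::string to_lower_copy(std::string s)
{
    for (std::size_t i = 0; i < s.size(); ++i)
        s[i] = static_cast<char>(std::tolower(static_cast<unsigned char>(s[i])));
    return s;
}

// Hypothetical helper: position of `needle` in `haystack`, ignoring case,
// or std::string::npos if it is not found. Lowercasing keeps the length
// unchanged, so the position is also valid in the original string.
std::size_t find_ci(const std::string& haystack, const std::string& needle)
{
    return to_lower_copy(haystack).find(to_lower_copy(needle));
}
```

Because the returned position applies to the original string too, a replace can then be done on the original with std::string::replace at that position.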
Hooked
Yes, or at least a dedicated set of functions to do so, like in PHP: str_replace, str_split, explode, implode ...
Viet
Guys, writing it yourself from scratch is not the point! Thanks anyway.
Viet
Is this in some way superior to using std::istringstream?
TokenMacGuy
No sir, I am simply more familiar with the C method. I would feel uncomfortable trying to show it with stringstream, simply because I might get it wrong. Feel free to edit my post.
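For comparison, a sketch of the stream-based version. The helper name parse_int and the rule that trailing non-whitespace characters count as an error are choices made for this example, not anything from the posts above.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: parses `text` as an int with std::istringstream.
// Returns true and stores the value in `out` only if the whole string
// (apart from surrounding whitespace) was consumed.
bool parse_int(const std::string& text, int& out)
{
    std::istringstream in(text);
    int value = 0;
    if (!(in >> value))
        return false;       // no number at the start, e.g. "abc" or ""
    in >> std::ws;          // skip any trailing whitespace
    if (!in.eof())
        return false;       // leftover characters, e.g. "12abc"
    out = value;
    return true;
}
```

Unlike atoi, the stream reports failure through its state, so bad input can be told apart from a legitimate 0.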
Hooked
Both approaches are fine :)
Viet
atoi does not provide any diagnostics or any way to signal an error. I'd suggest using strtol instead.
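A sketch of what the checks could look like with std::strtol. The wrapper name to_long and its bool return convention are assumptions for illustration.

```cpp
#include <cerrno>
#include <cstdlib>
#include <string>

// Hypothetical helper: converts `text` to long with std::strtol and reports
// failure instead of silently returning 0 the way atoi does.
bool to_long(const std::string& text, long& out)
{
    const char* begin = text.c_str();
    char* end = 0;
    errno = 0;
    long value = std::strtol(begin, &end, 10);
    if (end == begin)       // no digits were consumed
        return false;
    if (*end != '\0')       // trailing garbage such as "12abc"
        return false;
    if (errno == ERANGE)    // value does not fit in a long
        return false;
    out = value;
    return true;
}
```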
mloskot
+2  A: 

Bstring. Although I've never tried it myself, the feature set and speed presented on their site look good. Being available under your choice of GPL or BSD license also gives a good degree of freedom.

Also, the name suggests it's better so how can they lie? :)

LiraNuna
Hsieh also wrote SuperFastHash which, it turns out, is not so fast. Murmur2 is faster and better distributed.
Steven Sudit
Thanks. I found this lib on my query :)
Viet
Bstrlib doesn't support Unicode.
anno
Yeah, suitable for English :)
Viet
+3  A: 

I'm not sure I agree. Strings really shouldn't be thread-safe due to the overhead, except for reference-counting, if applicable. Most of the other functionality you want would turn strings into a garbage barge. Likewise, removing dependencies would remove their ability to work well with streams.

The one thing I'd suggest is that we could benefit from an immutable string class, particularly one that has no memory ownership or termination. I've written those before and they can be very helpful.
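To make the idea concrete, here is a minimal sketch of such a class. The name StrRef and its members are hypothetical, not the classes mentioned above. It holds only a pointer and a length, so the caller must keep the underlying buffer alive for as long as the view is used. C++17 later standardized the same idea as std::string_view.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical non-owning, read-only view over a run of characters.
// It never allocates, never frees, and needs no terminating '\0'.
class StrRef
{
public:
    StrRef(const char* data, std::size_t size) : data_(data), size_(size) {}
    StrRef(const std::string& s) : data_(s.data()), size_(s.size()) {}

    std::size_t size() const { return size_; }
    char operator[](std::size_t i) const { return data_[i]; }

    // View of at most `count` characters starting at `pos`; nothing is copied.
    StrRef substr(std::size_t pos, std::size_t count) const
    {
        if (pos > size_) pos = size_;
        if (count > size_ - pos) count = size_ - pos;
        return StrRef(data_ + pos, count);
    }

    bool equals(StrRef other) const
    {
        return size_ == other.size_ &&
               std::memcmp(data_, other.data_, size_) == 0;
    }

    // Copies into an owning std::string only when the caller asks for it.
    std::string to_string() const { return std::string(data_, size_); }

private:
    const char* data_;
    std::size_t size_;
};
```

Taking substrings this way costs no allocation, which is where most of the benefit comes from in parsing code.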

Steven Sudit
Thanks Steven. But string manipulation really bugs me. I need to do a lot of string processing in C++. In PHP, Perl and Python I can do things without breaking a sweat, but I struggle to do them in C++.
Viet
These are scripting languages...
Steven Sudit
Viet
@Steven: Your point being?
Newtopian
My point being that scripting languages have different goals than general-purpose languages. A scripting language such as PHP actively encourages piling on functionality until a class becomes a garbage barge, in the name of convenience. With a general-purpose language, there's so much functionality that you absolutely need to organize it so that it doesn't become overwhelming.
Steven Sudit
This has recently gotten some down-votes. While that's certainly your right, I was curious as to what the reason might be.
Steven Sudit
+5  A: 

The C++ String Algorithms Library from Boost has pretty much all of the features you need.

John T
Thanks John. I did my homework before posting here. It's neither simple nor intuitive. The splitting feature failed for me because it doesn't have an option to return empty strings.
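For reference, the behaviour being asked for (empty strings kept for adjacent delimiters) can be sketched with the standard library alone. The function name split_keep_empty is made up for this example and it only handles a single-character delimiter.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits `text` on the single character `delim`.
// Adjacent, leading and trailing delimiters produce empty strings
// instead of being merged away.
std::vector<std::string> split_keep_empty(const std::string& text, char delim)
{
    std::vector<std::string> parts;
    std::size_t start = 0;
    for (;;)
    {
        std::size_t pos = text.find(delim, start);
        if (pos == std::string::npos)
        {
            parts.push_back(text.substr(start));   // last (possibly empty) piece
            break;
        }
        parts.push_back(text.substr(start, pos - start));
        start = pos + 1;
    }
    return parts;
}
```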
Viet
Ahh, you beat me! I would also add that there are a variety of other Boost libraries related to string manipulation besides the string algorithms. Here's a link (http://www.boost.org/doc/libs?view=category_String) to the category!
TokenMacGuy
Thanks TokenMacGuy :)
Viet
@Viet so you have already taken a look at this, or is it helpful to you? I can assure you there are more Boost libs (as pointed out by Token) for string manipulation than just the algorithms. You could also roll your own function to process strings and check if they are empty before continuing on to one of these functions.
John T
Yeah, probably. But I find QString simpler and more intuitive. Thanks anyway.
Viet
+2  A: 

I found wxString convenient to use, and it has many features. However, it is part of a bigger library (wxWidgets), which may be too big when you just want to use strings. It also works without GUI components if you use only wxBase, which contains wxString and a 'few' other components.

EDIT: here is a link to the documentation. It accepts the standard functions of std::string and also a few others. I always find the BeforeFirst() and AfterFirst() very convenient when I have to parse some text. And it is really well documented.
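To show the kind of parsing this makes easy without pulling in wxWidgets, here is a sketch of two std::string helpers in the same spirit. These are not the wxString implementation. The names before_first and after_first are invented, and what they return when the character is missing is a choice made for this sketch.

```cpp
#include <cstddef>
#include <string>

// Everything before the first `ch`; the whole string if `ch` is absent
// (the not-found behaviour is a choice made for this sketch).
std::string before_first(const std::string& s, char ch)
{
    std::size_t pos = s.find(ch);
    return pos == std::string::npos ? s : s.substr(0, pos);
}

// Everything after the first `ch`; an empty string if `ch` is absent
// (again a choice made for this sketch).
std::string after_first(const std::string& s, char ch)
{
    std::size_t pos = s.find(ch);
    return pos == std::string::npos ? std::string() : s.substr(pos + 1);
}
```

Splitting a "key=value" line then takes one call for each half.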

rve
Thank you for your suggestion, rve!
Viet
+6  A: 

The C++ String Toolkit (StrTk) Library is a free library that consists of robust, optimized and portable generic string processing algorithms and procedures for the C++ language. The library is designed to be easy to use and integrate within existing code.

The library has the following capabilities:

* Generic string tokenizer and token iterators
* Split routines
* User specified delimiter and splitter policies (simple and regex based etc.)
* Conversions between data and hex and base-64
* In-place removal and replace routines
* Wild-card matching and globbing
* Fast 2D token grid processing
* Extensible string processing templates

and plenty more...

Compatible C++ Compilers:

* GCC 4.0+
* Intel C++ Compiler 9.0+
* Microsoft Visual C++ 8.0+
* Comeau C/C++ 4.1+

Source:

* Download: http://www.partow.net/programming/strtk/index.html
* SVN: http://code.google.com/p/strtk/
Beh Tou Cheh
Good find. Thanks :) I'll have a look.
Viet