Hey, what's the easiest way to convert a C++ std::string into another std::string in which all the unprintable characters are escaped?

For example, for a string of length two with the content 0x61, 0x01, the resulting string might be "a\x01" or "a%01".

Just looking for the easiest already-implemented solution. Specific output format is less important.

Thanks, Dan.

+2  A: 

One person's unprintable character is another's multi-byte character. So you'll have to define the encoding before you can work out which bytes map to which characters, and which of those are unprintable.
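
To make the point concrete, here is a minimal sketch of a byte-wise escaper that has to pick such a rule. It assumes a single-byte encoding and the default "C" locale, where std::isprint accepts only 0x20..0x7E; the helper name escape_unprintable is hypothetical and not from this thread. Under that assumption both bytes of a UTF-8 encoded "é" would be escaped, which is exactly the ambiguity described above.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not from the thread): writes every byte that
// std::isprint rejects as \xHH. Assumes a single-byte encoding and the
// default "C" locale, where only 0x20..0x7E count as printable.
std::string escape_unprintable(const std::string& in)
{
    static const char hexdig[] = "0123456789abcdef";
    std::string out;
    for (std::string::size_type i = 0; i < in.size(); ++i) {
        // Convert to unsigned char first: passing a negative char value
        // to std::isprint is undefined behaviour.
        const unsigned char c = static_cast<unsigned char>(in[i]);
        if (std::isprint(c)) {
            out += static_cast<char>(c);
        } else {
            out += "\\x";
            out += hexdig[c >> 4];   // high nibble
            out += hexdig[c & 0xF];  // low nibble
        }
    }
    return out;
}
```

A program that calls setlocale or works with multi-byte text would need a different classification rule.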

Douglas Leeder
+4  A: 

Take a look at Boost's String Algorithms Library. You can use its is_print classifier (together with its operator! overload) to pick out nonprintable characters, and its find_format() functions can replace those with whatever formatting you wish.

#include <iostream>
#include <string>
#include <boost/format.hpp>
#include <boost/algorithm/string.hpp>

struct character_escaper
{
    template<typename FindResultT>
    std::string operator()(const FindResultT& Match) const
    {
        std::string s;
        for (typename FindResultT::const_iterator i = Match.begin();
             i != Match.end();
             i++) {
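            // Cast through unsigned char so bytes >= 0x80 print as two hex
            // digits instead of a sign-extended value such as ffffff80.
            s += str(boost::format("\\x%02x")
                     % static_cast<int>(static_cast<unsigned char>(*i)));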
        }
        return s;
    }
};

int main (int argc, char **argv)
{
    std::string s("a\x01");
    boost::find_format_all(s, boost::token_finder(!boost::is_print()), character_escaper());
    std::cout << s << std::endl;
    return 0;
}
Josh Kelley
A: 

This assumes the execution character set is a superset of ASCII and CHAR_BIT is 8. For OutIter, pass a back_inserter (e.g. for a vector<char> or another string), an ostream_iterator, or any other suitable output iterator.

#include <string>

template<class OutIter>
OutIter write_escaped(std::string const& s, OutIter out) {
  *out++ = '"';
  for (std::string::const_iterator i = s.begin(), end = s.end(); i != end; ++i) {
    unsigned char c = *i;
    if (' ' <= c and c <= '~' and c != '\\' and c != '"') {
      *out++ = c;
    }
    else {
      *out++ = '\\';
      switch(c) {
      case '"':  *out++ = '"';  break;
      case '\\': *out++ = '\\'; break;
      case '\t': *out++ = 't';  break;
      case '\r': *out++ = 'r';  break;
      case '\n': *out++ = 'n';  break;
      default:
        char const* const hexdig = "0123456789ABCDEF";
        *out++ = 'x';
        *out++ = hexdig[c >> 4];
        *out++ = hexdig[c & 0xF];
      }
    }
  }
  *out++ = '"';
  return out;
}
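
As a usage sketch for the back_inserter case mentioned above, the function can be wrapped so that it returns a new std::string. The wrapper name escaped is hypothetical and not part of the answer; write_escaped is repeated here only so the block compiles on its own, written with && in place of and.

```cpp
#include <iterator>
#include <string>

// Same logic as write_escaped above, repeated so this block is
// self-contained (uses && rather than `and`).
template<class OutIter>
OutIter write_escaped(std::string const& s, OutIter out) {
  char const* const hexdig = "0123456789ABCDEF";
  *out++ = '"';
  for (std::string::const_iterator i = s.begin(), end = s.end(); i != end; ++i) {
    unsigned char c = *i;
    if (' ' <= c && c <= '~' && c != '\\' && c != '"') {
      *out++ = c;                      // printable ASCII passes through
    } else {
      *out++ = '\\';
      switch (c) {
      case '"':  *out++ = '"';  break;
      case '\\': *out++ = '\\'; break;
      case '\t': *out++ = 't';  break;
      case '\r': *out++ = 'r';  break;
      case '\n': *out++ = 'n';  break;
      default:                         // everything else becomes \xHH
        *out++ = 'x';
        *out++ = hexdig[c >> 4];
        *out++ = hexdig[c & 0xF];
      }
    }
  }
  *out++ = '"';
  return out;
}

// Hypothetical convenience wrapper: collects the escaped output in a
// new string through std::back_inserter.
std::string escaped(std::string const& s) {
  std::string result;
  write_escaped(s, std::back_inserter(result));
  return result;
}
```

Passing std::ostream_iterator<char>(std::cout) instead of the back_inserter should stream the same characters directly, without building an intermediate string.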
Roger Pate
Ben Voigt
You can use *and* without a header in standard C++ too. This was copied from another project and I forgot to change those to make up for MSVC's deficiencies.
Roger Pate
A: 

Have you seen the article about how to Generate Escaped String Output Using Spirit.Karma?

hkaiser