How to print the bit representation of a string

std::string s = "\x80";

void print (std::string &s) {

    //How to implement this
}
+3  A: 

Try:

#include <iostream>
#include <string>

using namespace std;

void print(string &s) {
  string::iterator it; 
  int b;

  for (it = s.begin(); it != s.end(); it++) {
    for (b = 128; b; b >>= 1) {
      cout << (*it & b ? 1 : 0); 
    }   
  }
}

int main() {
  string s = "\x80\x02";
  print(s);
}
Stephan202
functional, but assumes CHAR_BIT == 8. Could be slightly more portable. Also likely could leverage std::for_each to make it more concise.
Evan Teran
You're right. I'll vote your answer up. (I'm a novice to C++; in C a char is always 1 byte wide).
Stephan202
Thanks for the +1, but you are mistaken... both C and C++ say that sizeof(char) == 1. However, it says nothing to guarantee that a char is 1 byte big. Which is why CHAR_BIT exists (and does in C's <limits.h> too).
Evan Teran
Don't you mean that while a char is always 1 byte, it is not guaranteed that a byte always contains 8 bits? I mean, sizeof returns numbers whose unit is byte, right?
Stephan202
Yes, I mistyped. Neither C nor C++ guarantees that a byte is 8 bits wide. The unit of sizeof is chars, not bytes, though :-P (hence sizeof(char) == 1 by definition).
Evan Teran
Good, it's clear to me now :) Thanks!
Stephan202
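To make the distinction in these comments concrete, here is a minimal sketch. The helper name bits_per_char is hypothetical and not from the thread. sizeof(char) is 1 by definition, while the number of bits in a char is reported by CHAR_BIT from <climits>, which the standard only requires to be at least 8.

```cpp
#include <climits>
#include <cstddef>

// sizeof(char) is 1 by definition; only CHAR_BIT >= 8 is guaranteed.
static_assert(sizeof(char) == 1, "guaranteed by the language");
static_assert(CHAR_BIT >= 8, "a char has at least 8 bits");

// Hypothetical helper: how many binary digits one char needs when printed.
inline std::size_t bits_per_char() {
    return sizeof(char) * CHAR_BIT;
}
```

On most desktop platforms bits_per_char() returns 8, but code that loops from 128 downward silently assumes that value.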
+4  A: 

Little-endian or big-endian?

// Option 1: least significant bit first
for (std::size_t i = 0; i < s.length(); i++)
    for (unsigned char c = 1; c; c <<= 1)
        std::cout << (s[i] & c ? "1" : "0");

// Option 2: most significant bit first
for (std::size_t i = 0; i < s.length(); i++)
    for (unsigned char c = 0x80; c; c >>= 1)
        std::cout << (s[i] & c ? "1" : "0");

Since I hear some grumbling about portability of assuming that a char is a 8-bit byte in the comments of the other answers...

for (int i = 0; i < s.length(); i++)
    for (unsigned char c = ~((unsigned char)~0 >> 1); c; c >>= 1)
        std::cout << (s[i] & c ? "1" : "0");

This is written from a very C-ish standpoint... if you're already using C++ with STL, you might as well go the whole way and take advantage of the STL bitset functionality instead of playing with strings.

ephemient
why don't you do it the simple way? with this "tricky" thing "~(~(unsigned char)0 >> 1)", you get rid of one implementation defined behavior (assuming CHAR_BIT == 8) just to make use of another one (shifting of negative value). ~(unsigned char)0 will become -1 on a two's complement machine.
Johannes Schaub - litb
and if >> sign extends, you will end up doing ~-1 which will become 0 on a two's complement machine. i would just do it simply 1 << (CHAR_BIT -1) works and is much more readable IMHO.
Johannes Schaub - litb
(reason is because ~ promotes its operand first. so the unsigned char becomes an int, and the 0 then is ~'ed and on a two's complement machine, that will become -1 finally). well i try to avoid all bit-sex because it's easy to make mistakes. so i just use std::bitset and be done with it :)
Johannes Schaub - litb
`>>` doesn't sign-extend because the type is unsigned -- that's what the `(unsigned char)` is for, and the ~ does not promote it to a `signed int` (why do you think it does?). IMO it's easier to comprehend this than the `CHAR_BIT` shifting, but who cares -- with constant folding, it's all the same.
ephemient
Interesting -- the type promotion behavior seems to depend on whether I use a C compiler or a C++ compiler. In any case, I saw the C tag and went for that instead of properly using STL, which is a better option if available. I'll add that to my answer.
ephemient
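The promotion question raised in these comments can be checked at compile time. The sketch below assumes a typical platform where int is wider than char. The helper name high_bit_mask is hypothetical. C++ applies the integral promotions to the operand of ~, so the result of ~ on an unsigned char has type int there. The helper builds the top-bit mask with the 1 << (CHAR_BIT - 1) form suggested above, using an unsigned literal to stay clear of signed shifts.

```cpp
#include <climits>
#include <type_traits>

// The operand of ~ is promoted first, so the result type is int here.
// Assumption: int can represent every unsigned char value on this platform.
static_assert(
    std::is_same<decltype(~static_cast<unsigned char>(0)), int>::value,
    "unsigned char promotes to int before ~ is applied");

// Hypothetical helper: mask with only the most significant bit of a char set.
inline unsigned char high_bit_mask() {
    return static_cast<unsigned char>(1u << (CHAR_BIT - 1));
}
```

With 8-bit chars, high_bit_mask() yields 0x80, the same starting value as the hard-coded loops.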
+1  A: 

I am sorry I marked this as a duplicate. Anyway, to do this:

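#include <algorithm>
#include <climits>
#include <iostream>
#include <string>

// Prints one byte as CHAR_BIT binary digits, most significant bit first.
// Must be defined before printbits() uses it.
struct print_byte {
    void operator()(char b) const {
        unsigned char byte = static_cast<unsigned char>(b);
        for (int i = CHAR_BIT - 1; i >= 0; --i)
            std::cout << ((byte >> i) & 1);
    }
};

void printbits(std::string const& s) {
    std::for_each(s.begin(), s.end(), print_byte());
}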
dirkgently
+3  A: 

expanding on Stephan202's answer:

#include <algorithm>
#include <climits>
#include <iostream>
#include <string>

struct print_bits {
    void operator()(char ch) const {
        for (unsigned b = 1u << (CHAR_BIT - 1); b != 0; b >>= 1) {
            std::cout << (ch & b ? 1 : 0);
        }
    }
};

void print(const std::string &s) {
    std::for_each(s.begin(), s.end(), print_bits());
}

int main() {
    print("\x80\x02");
}
Evan Teran
+5  A: 

I'd vote for bitset:

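#include <bitset>
#include <climits>
#include <iostream>
#include <string>

void pbits(std::string const& s) {
    for (std::size_t i = 0; i < s.size(); i++)
        std::cout << std::bitset<CHAR_BIT>(static_cast<unsigned char>(s[i])) << " ";
}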

int main() {
    pbits("\x80\x70"); 
}
Johannes Schaub - litb
I'd vote for bitset too. Ooh, hey, I just did :-)
SCFrench
+3  A: 

The easiest solution is the following:

// Requires <algorithm>, <bitset>, <climits>, <iostream>, <iterator>, <string>.
const std::string source("test");
std::copy(
 source.begin(),
 source.end(),
 std::ostream_iterator<
  std::bitset< CHAR_BIT > >( std::cout, ", " ) );
  • Some STL implementations may accept base 2 in the std::setbase() manipulator, but the standard only defines bases 8, 10, and 16, so this is not portable.
  • You could write your own manipulator if you want a more flexible solution than the existing ones.

EDIT:
Oops. Someone already posted a similar solution.

bb
still, +1 for using an STL algorithm.
SCFrench
A: 

If you want to do it manually, you can always use a lookup table. 256 values in a static table is hardly a lot of overhead:

static const char* bitValues[] =
{
"00000000",
"00000001",
"00000010",
"00000011",
"00000100",
....
"11111111"
};

Then printing is a simple matter of:

for (string::const_iterator i = s.begin(); i != s.end(); ++i)
{
    // Index by the unsigned value: a plain char may be negative.
    cout << bitValues[static_cast<unsigned char>(*i)];
}
Eclipse
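If typing out the table by hand is unappealing, it could also be generated once at startup. The following is only a sketch of that idea, not part of the answer above. The names make_bit_table and to_bits are hypothetical, and it assumes C++11 and a platform where UCHAR_MAX is smaller than UINT_MAX.

```cpp
#include <bitset>
#include <climits>
#include <string>
#include <vector>

// Hypothetical helper: builds one binary string per possible byte value.
inline std::vector<std::string> make_bit_table() {
    std::vector<std::string> table;
    table.reserve(UCHAR_MAX + 1);
    for (unsigned value = 0; value <= UCHAR_MAX; ++value)
        table.push_back(std::bitset<CHAR_BIT>(value).to_string());
    return table;
}

// Looks up each char by its unsigned value so the index is never negative.
inline std::string to_bits(const std::string& s) {
    static const std::vector<std::string> table = make_bit_table();
    std::string out;
    for (std::string::size_type i = 0; i < s.size(); ++i)
        out += table[static_cast<unsigned char>(s[i])];
    return out;
}
```

Returning a std::string instead of writing to cout directly makes the result easy to check or reuse; printing is then just std::cout << to_bits(s).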