views:

629

answers:

6

I have a file containing 800 lines like:

id       binary-coded-info
---------------------------
4657     001001101
4789     110111111
etc.

where each 0 or 1 stands for the presence of some feature. I want to read this file and do several bitwise logical operations on the binary-coded-info (the operations depend on user input and on info from a second file with 3000 lines). Then, these re-calculated binary-coded-info should be written to file (with trailing zeros, e.g.

4657     000110011
4789     110110000
etc.

How should I do this without writing my own base conversion routine? I am open for anything, also languages I do not know, like python, perl, etc. And it should work without compiling.

So far, I tried to script, awk and sed my way. This would mean (I think): batch read as base-2, convert to base-10, do bitwise operations depending on user input and second file, convert to base-2, add leading zeros and print. The usual console hints to use bc do not seem elegant because I have many lines in a file. The same holds for dc.sed. And awk doesnt seem to have an equivalent to flagging input as binary ( as in "echo $((2#101010))" ), and also, the printf trick doesn't work for binary. So, how would I do this most elegantly (or, at all, for that matter) ?

A: 

In C, you can use "strtol(str, NULL, 2)" to do the conversion, if you're already doing this in C.

Something like the following would work:

FILE* f = fopen("myfile.txt", "r");
char line[1024];
while ((line = fgets(line, sizeof(line), f))
{
  char* p;
  long column1 = strtol(line, &p, 10);
  long column2 = strtol(p, &p, 2);
  ...
}

You'll need to add error-handling, etc.

Rick C. Petty
+3  A: 

Why convert them and use bit operations?

In Python, you can do all of this as a string.

for line in myFile:
    key, value = line.split()
    bits = list(value)
    # bits will be a list of 1-char strings ['1','0','1',...]
    # ... do stuff to bits ...
    print key, "".join( value )
S.Lott
You can also convert the items in bits to bools with: bits = [n!='0' for n in bits] This way you can just say if bits[2]:...
FogleBird
+1  A: 

In python, you can convert to binary using int, specifying base 2. ie:

>>> int('110111111',2)
447

To convert back, there is a bin function in python2.6 or 3, but not in python2.5, so you'd need to implement it yourself (or use something like the below):

def bin(x, width):
    return ''.join(str((x>>i)&1) for i in xrange(width))[::-1]

>>> bin(447, 9)
110111111

(The width is the number of digits to pad to - your examples seem to be using 9-bit numbers.)

Brian
A: 

"Simple" Perl one liner (replace foo bar baz quux with your flags

perl -le '@f=qw/foo bar baz quux/;$_&&print($f[$i]),$i++for split//, shift' 1011

Here is a readable Perl version:

#!/usr/bin/perl

use strict;
use warnings;

#flags that can be turned on and off, the first 
#flag is turned on/off by the left-most bit
my @flags = (
    "flag one",
    "flag two",
    "flag three",
    "flag four",
    "flag five",
    "flag six",
    "flag seven",
    "flag eight",
);

#turn the command line argument into individual
#ones and zeros
my @bits = split //, shift; 

#loop through the bits printing the flag that
#goes with the bit if it is 1
my $i = 0;
for my $bit (@bits) {
    if ($bit) {
     print "$flags[$i]\n";
    }
    $i++;
}
Chas. Owens
A: 

Expanding on Brian's answer:

# Get rid of the '----' line for simplicity
data_file = '''id       binary-coded-info
4657     001001101
4789     110111111
'''
import cStringIO
import csv, sys
data = [] # A list for the row dictionaries (we could just as easily have used the regular reader method)
# Read in the file using the csv module
# Each row will be a dictionary with keys corresponding to the first row
reader = csv.DictReader(cStringIO.StringIO(data_file), delimiter=' ', skipinitialspace = True)
try:
    for row in reader:
        data.append(row) # Add the row dictionary to the data list
except csv.Error, e:
    sys.exit('file %s, line %d: %s' % (filename, reader.line_num, e))
# Do something with the bits
first = int(data[0]['binary-coded-info'],2) # First bit string
assert(first & int('00001101',2) == int('1101',2)) # Bitwise AND
assert(first | int('00001101',2) == int('1001101',2)) # Bitwise OR
assert(first ^ int('00001101',2) == int('1000000',2)) # Bitwise XOR
assert(~first == int('110110010',2)) # Binary Ones Complement
assert(first << 2 == int('100110100',2)) # Binary Left Shift
assert(first >> 2 == int('000010011',2)) # Binary Right Shift

See the python docs on expressions for more information and csv module for more information on the csv module.

A: 

Hi, thanks for all of your answers! I started learning python and will hopefully look into perl alongside. This was a great starting point! - Matt