ansaurus

Question

C++: Simplifying my program to convert numbers to from one base to another.

Answer 1

+4 A:

I don't think you need the inner loop:

for (int i=0; i<starting_base; i++) {

What is its purpose?

Rather, you should get the character at value[ pos ] and convert it to an integer. The conversion depends on base, so it may be better to do it in a separate function.

You are defining char hex[ 16 ] twice, once in each function. It may better to do it at only one place.

EDIT 1: Since this is "homework" tagged, I cannot give you the full answer. However, here is an example of how to_dec() is supposed to work. (Ideally, you should have constructed this!)

Input:

  char * value = 3012, 
  int base = 4,

Math:

Number = 3 * 4^3 + 0 * 4^2 + 1 * 4^1 + 2 * 4^0 = 192 + 0 + 4 + 2 = 198

Expected working of the loop:

  x = 0
  x = 4x + 3 = 3
  x = 4x + 0 = 12
  x = 4x + 1 = 49
  x = 4x + 2 = 198

  return x;

EDIT 2:

Fair enough! So, here is some more :-)

Here is a code sketch. Not compiled or tested though. This is direct translation of the example I provided earlier.

unsigned
to_dec( char * inputString, unsigned base )
{
  unsigned rv = 0; // return value
  unsigned c;      // character converted to integer

  for( char * p = inputString; *p; ++p ) // p iterates through the string
  {
    c = *p - hex[0];
    rv = base * rv + c;
  }

  return rv;
}

ArunSaha 2010-03-04 00:03:30

Could you give any examples of what you're proposing? I'm not sure I understand your suggestion.

Spin City 2010-03-04 01:39:16

I know it's bad form to give people complete answers to homework questions, but I don't think that applies when the homework has already been done and turned in. ;) Anyway, I understand the method of multiplying it (In fact, the first time I scripted something like this I used that method.), but how do you propose eliminating the inner loop? UniquePhoton's method?

Spin City 2010-03-04 05:19:48

+1 for the example. Thanks! Is there any reason you assigned p to inputString? Couldn't you just have used inputString itself there?

Spin City 2010-03-05 21:12:54

Answer 2

+1 A:

You can initialize arrays by string literals (notice that the terminating \0 is not included because the size of the array doesn't permit that):

char const hex[16] = "0123456789ABCDEF";

Or just use a pointer to the string literal for the same effect:

char const* hex = "0123456789ABCDEF";

Tronic 2010-03-04 00:07:32

Is the terminating \0 actually not included or does it merely overflow and corrupt some memory?

Spin City 2010-03-04 00:16:57

You get a compile error if you try to make the array any shorter. The standard has a special rule for \0 being not copied if it doesn't fit.

Tronic 2010-03-04 00:24:07

Interesting. Thank you! What about the pointer? Does that have the trailing null byte?

Spin City 2010-03-04 00:26:54

Cutting off '\0' does not work in C++. It's valid in C only.

Johannes Schaub - litb 2010-03-04 00:27:32

Why does it work in C but not C++?

Spin City 2010-03-04 00:43:12

@Spin, C++'s rationale is "When these non-terminated arrays are manipulated by standard string routines, there is potential for major catastrophe."

Johannes Schaub - litb 2010-03-04 02:16:18

Does that mean that my character array would be terminated with a null byte anyway?

Spin City 2010-03-04 02:48:32

@Spin, no your code is fine. But Tronic's one is invalid.

Johannes Schaub - litb 2010-03-04 14:31:14

Thank you for clearing things up. And... I'm a little confused now. I understand what's happening fine, but I don't understand why it's done this way. If they're worried about non-terminated char arrays causing problems when passed through cstring functions, wouldn't they just terminate all char arrays with nulls for good measure? This approach strikes me as a little... odd.

Spin City 2010-03-04 16:51:41

@Spin City: How do you terminate the `hex[16]` with a null character when it's already got 16 characters in it? The compiler would need to extend it by one position, and that's not something the C++ people wanted to happen. If you specify something inconsistently, that's an error in C++. (It wasn't in C, for other reasons. C and C++ are two distinct languages sharing a fairly large common core, and they should be programmed in distinctly different ways.)

David Thornley 2010-03-04 22:52:51

Right, the inconsistency is what bothers me. The only thing it accomplishes in my eyes is making declaring character arrays more cumbersome.

Spin City 2010-03-05 00:49:12

Answer 3

+1 A:

I would stay away from GOTO statements unless they are absolutely necessary. GOTO statements are easy to use but will lead to 'spaghetti code'.

Try using a loop instead. Something along the lines of this:

bool base_is_invalid = true;
while ( base_is_invalid ) {

    cout << "Enter the number, then starting base, then ending base:" << endl;
    cin >> value >> starting_base >> ending_base;

    if (starting_base < 2 || starting_base > 16 || ending_base < 2 || ending_base > 16)
        cout << "Invalid number. ";
    else
        base_is_invalid = false;

}

emceefly 2010-03-04 00:59:05

I'm sorry. You'll have to explain to me how that's better than what I wrote. All I can see is that you added some totally unnecessary logic and variables to make it do the same thing.

Spin City 2010-03-04 01:22:29

You're right they do exactly the same thing and the while loop is more code. It really makes no difference in a short program like this but as your programs increase in size GOTO statements will make your code unstructured, hard to read, and usually a nightmare to maintain. It's better to get into the habit of not using them especially when you're just starting out with C++.

emceefly 2010-03-04 01:28:41

That kind of "one size fits all" logic bothers me, and I have the distinct feeling you just insinuated I'm an idiot. A statement like "You should never use gotos or else your code will be a mess." operates under the assumption that the coder is too inept to ever be able to use gotos in a clean fashion. Either that or it presupposes that any use of goto is automatically an unclean use which strikes me as even more absurd.

Spin City 2010-03-04 02:01:08

Actually he's totally right. "You should never use gotos or else your code will be a mess." is true. Stop being so stuck up or you'll never learn anything.

Cam 2010-03-04 04:24:39

No, I'm afraid it's when you believe what other people tell you to believe without ever questioning it that you'll never learn anything. I have reason to believe he's wrong, so if you wish to convince me otherwise, you will have to provide me a reason in turn. And if that's called being stuck up, then I will wear that badge with pride.

Spin City 2010-03-04 05:07:13

@Spin City: Edger Dijkstra wrote an article called `A case against the GO TO statement`. He was probably more intelligent that either you or me, so I would heed his advice... and check his reasons. Niklaus Wirth (then editor of Communications of the ACM) wrote the ACM famous letter: `Go To Statement Considered Harmful`

Matthieu M. 2010-03-04 09:57:05

"Please don't fall into the trap of believing that I am terribly dogmatical about [the go to statement]. I have the uncomfortable feeling that others are making a religion out of it, as if the conceptual problems of programming could be solved by a single trick, by a simple form of coding discipline!" -Edsger W. Dijkstra

Spin City 2010-03-04 18:17:38

There is also Donald Knuth's paper: Structured Programming With Go To Statements - http://pplab.snu.ac.kr/courses/adv_pl05/papers/p261-knuth.pdf

Spin City 2010-03-04 18:25:26

My view is that gotos, used judiciously, are fine. And judicious use is not "avoid unless absolutely necessary" (Böhm and Jacopini already proved that essentially means "never use gotos" since any goto statement can be restructured to avoid its use.) but rather "use it whenever it provides for the cleanest and clearest flow control" and when it is reasonable for the sake of efficiency.

Spin City 2010-03-04 18:30:49

How about using a `do...while` loop? `do { ... } while (starting_base < 2 || starting_base > 16 || ending_base < 2 || ending_base > 16);`?

Alok 2010-03-04 18:49:29

@spincity. Ok, go get a dev job and start writing gotos in production c++ code. I guarantee your colleagues will bitch at you. Also, your code is only like 100 lines long, and it's already a mess.

SP 2010-03-04 18:51:53

@Spin City: You actually make a pretty good case about using goto here (it's a very simple program, and it's very apparent what that label + goto are doing). However, one thing to consider is that pretty much all branch control mechanisms are essentially the same other than minor semantics, and a key choice in usage is the *intent* of your branching. Here, your intent is that if you want to keep asking the user to input until it is valid, which implies that you want to loop until a conditional, which is a `do { } while ();` loop.

Tanzelax 2010-03-04 18:56:33

@Alok and @Tanzelax: Ordinarily I would do that, but it gets messy since I also want it to `cout << "Invalid number. ";` if the while statement checks true, and to do that I need an if statement inside the loop instead of being able to use the loop's check. That's when I felt using the loop just became overly circuitous. In the end, the flow control came out neater to my eyes by just using a single if block with a goto.

Spin City 2010-03-04 19:16:43

@SP: That's just trying to peer pressure me into avoiding gotos. And if you think the code is a mess, feel free to post a new answer pointing out which parts specifically are messy and why with suggested improvements?

Spin City 2010-03-04 21:42:38

@Spin City: As long as you're in the early stages of learning, I'd strongly suggest avoiding `goto`s. Once you get a lot of experience in the language, you might find a place where it's indicated. In well over seven years of jobs doing mostly C++ programming, I haven't found one.

David Thornley 2010-03-04 22:23:48

@SP: You are incredibly naïve if you think there are no gotos in any production C++ code. In fact, the production code I'm working on has gotos in it.

dreamlax 2010-03-04 22:26:28

I think it's too hard to make the case for or against GOTOs. It is too subjective and while there are definitely valid cases, why present a case of whether the GOTO is valid or not? Simply make a habit of not using it (even if it is perfectly valid) and in the long run it'll be to your benefit. Use a function or use conditionals and if you follow good naming convention you'll still maintain the same readability. I would love to see an example where a GOTO is the only way to go but I don't think this is an ideal example.

nevets1219 2010-03-04 22:34:09

@nevets1219: Getting out of 7 nested if's is a perfect valid way to use a goto.

Toad 2010-03-05 08:14:24

Can't you make the nested IF a function and return instead? I guess both seems like a valid approach. I guess the function approach isn't that good if you want to follow a strict one exit point only style.

nevets1219 2010-03-05 17:58:28

But here's the thing, nevets1219: You haven't eliminated the goto; your goto is now just a return statement and you wrapped it in a function. That's fine if you had other reasons to make it a function anyway, but otherwise this sort of behavior ranks up with writing `do ... while (false)` so you can `break` for forward jumps. In the end, you're just writing some roundabout code so you can get your goto and convince yourself you didn't actually write a goto.

Spin City 2010-03-05 20:28:17

Don't forget the context of Dijkstra's GOTO debate. BASIC had recently been invented, and the *only* control flow it had, besides FOR...NEXT loops, was GOTO, IF...GOTO, and ON...GOTO. (FORTRAN, on which BASIC was based, had similar deficiencies.) So not only did you have to use GOTO to emulate break and continue, but you also had to use it to emulate if...else, switch, while, and do...while. Programs had a *lot* of GOTOs.And so, when "structured programming" became popular, there was a severe backlash against GOTO.

dan04 2010-03-25 02:45:33

Answer 4

+1 A:

to_dec() looks to complicated, here is my shot at it:

int to_dec(char* value, int starting_base)
{
    int return_value = 0;
    for (char* cur = value + strlen(value) - 1; cur >= value; cur--) {
        // assuming chars are ascii/utf: 0-9=48-57, A-F=65-70
        // faster than loop
        int inval = *cur - 48;
        if (inval > 9) {
            inval = *cur - 55;
            if (inval > 15) {
                // throw input error
            }
        }
        if (inval < 0) {
            // throw input error
        }
        if (inval >= starting_base) {
            // throw input error
        }
        // now the simple calc
        return_value *= starting_base;
        return_value += inval;
    }
    return return_value;
}

UniquePhoton 2010-03-04 01:39:27

Nice. Wouldn't it be better to replace "inval = *cur - 55;" with "inval -= 7;" though?

Spin City 2010-03-04 01:56:40

Yes that is the same, but I find -55 is better for readability.

UniquePhoton 2010-03-04 02:01:25

If you mean for 48 to represent the character '0', just put '0' instead of 48. Keep it readable and let the compiler do the translation. `int inval = *cur - '0';`

indiv 2010-03-04 19:05:22

Answer 5

A:

Apart from the things already mentioned, I would suggest using the new-operator instead of free. The advantages of new are that it also does call constructors - which is irrelevant here since you're using a POD type, but important when it comes to objects such as std::string or your own custom classes - and that you can overload the new operator to suit your specific needs (which is irrelevant here, too :p). But don't go ahead using malloc for PODs and new for classes, since mixing them is considered bad style.

But okay, you got yourself some heap memory in from_dec... but where is it freed again? Basic rule: memory that you malloc (or calloc etc) must be passed to free at some point. The same rule applies to the new-operator, just that the release-operator is called delete. Note that for arrays, you need new[] and delete[]. DON'T ever allocate with new and release with delete[] or the other way around, since the memory won't be released correctly.

Nothing evil will happen when your toy program won't release the memory... I guess your PC has got enough RAM to cope with it and when you shut down your program, the OS releases the memory anyway.. but not all programs are (a) that tiny and (b) shut down often.

Also I'd avoid conio.h, since this is not portable. You're not using the most complicated IO, so the standard headers (iostream etc) should do.

Likewise, I think most programmers using modern languages follow the rule "Only use goto if other solutions are really crippled or tons of more work". This is a situation that can be easily solved by using loops, as shown by emceefly. In your program the goto is easy to handle, but you won't be writing such small programs forever, will you? ;) I, for example, was presented with some legacy code recently.. 2000 lines of goto-littered code, yay! Trying to follow the code's logical flow was almost impossible ("Oh, jump ahead 200 lines, great... who needs context anyway"), even harder was to rewrite the damn thing.

So okay, your goto doesn't hurt here, but where's the benefit? 2-3 lines shorter? Doesn't really matter overall (if you're paid by lines of code, this could also be a major disadvantage ;)). Personally I find the loop version more readable and clean.

As you see, most of the points here can be ignored easily for your program, since it's a toy program. But when you think of larger programs, they make more sense (hopefully) ;)

Christian 2010-03-04 07:30:57

Answer 6

+1 A:

for the initial conversion from ascii to an integer, you can also use a lookup table (just as you are using a lookuptable to to the conversion the other way around) , which is much faster then searching through the array for every digit.

 int to_dec(char value[], int starting_base) 
 {
      char asc2BaseTab = {0,1,2,3,4,5,6,7,8,9,-1,-1,-1,-1,-1,-1,-1,10,11,12,13,14,15, //0-9 and A-F (big caps)
                         -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,  //unused ascii chars
                          10,11,12,13,14,15};  //a-f (small caps)

      srcIdx = strlen(value);
      int number=0;
      while((--srcIdx) >= 0)
      {
           number *= starting_base;          
           char asciiDigit = value[srcIdx];
           if(asciiDigit<'0' || asciiDigit>'f') 
           {
                //display input error
           } 
           char digit = asc2BaseTab[asciiDigit - '0'];
           if(digit == -1)
           {
                 //display input error
           }
           number += digit;
      }
      return number;
 }

p.s. excuses if there are some compile errors in this...I couldn't test it...but the logic is sound.

Toad 2010-03-04 20:38:52

Answer 7

+1 A:

In your description of the assignment as given it says:

"I was asked to use separate functions to convert to and from base 10."

If that is really what the teacher meant and wanted, which is doubtful, your code doesn't do that:

int to_dec(char value[], int starting_base)

is returning an int which is a binary number. :-) Which in my opinion does make more sense.

Did the teacher even notice that?

Harvey 2010-03-23 07:48:48

Answer 8

+1 A:

C and C++ are different languages, and with different styles of programming. You better not to mix them. (Where C and C++ differ)

If you are trying to use C++, then:

Use std::string instead of char* or char[].

int to_dec(string value, int starting_base);
string from_dec(int value, int ending_base);

No any mallocs, use new/delete. But actually C++ manages memory automatically. The memory is freed as soon as variable is out of scope (unless you are dealing with pointers). And pointers are the last thing you need to deal with.

We don't need here any lookup tables, just a magic string.

string hex = "0123456789ABCDEF";//The index of the letter is its decimal value. A is 10, F is 15.
//usage
char c = 'B';
int value = hex.find( c );//works only with uppercase;

The refactored to_dec can be like that.

int to_dec(string value, int starting_base) {
  string hex = "0123456789ABCDEF";
  int result = 0;
  for (int power = 0; power < value.size(); ++power) {
    result += hex.find( value.at(value.size()-power-1) ) * pow((float)starting_base, power);
  }
  return result;
}

And there is a more elegant algorithm to convert from base 10 to any other See there for example. You have the opportunity to code it yourself :)

Draco Ater 2010-03-23 08:34:16

Answer 9

+1 A:

In your from_dec function, you're converting the digits from left to right. An alternative is to convert from right to left. That is,

std::string from_dec(int n, int base)
{
    std::string result;
    bool is_negative = n < 0;

    if (is_negative)
    {
       n = - n;
    }

    while (n != 0)
    {
        result = DIGITS[n % base] + result;
        n /= base;
    }

    if (is_negative)
    {
        result = '-' + result;
    }

    return result;
}

This way, you won't need the log function.

(BTW, to_dec and from_dec are inaccurate names. Your computer doesn't store numbers in base 10.)

dan04 2010-03-24 13:50:12

ansaurus

tags:

views:

answers:

C++: Simplifying my program to convert numbers to from one base to another.

related questions