tags:

views:

293

answers:

5

The "impossible" K&R exercise.

"Write a program entab that replaces strings of blanks by the minimum number of tabs and blanks to achieve the same spacing. Use the same tab stops, say every n columns. Should n be a variable or a symbolic parameter?"

The problem I'm having is, I'm unsure about how to even do this correctly. I know it's not very explanatory, but that's pretty much the problem here. Most of the examples I've seen have counted a number of blanks, and replaced those series with a tab, but this isn't what its asking, I reckon I understand what its asking, but currently feel unable to do this.

Could anyone help :)

Edit: The code I've written so far can be found here.

A: 

My understanding is that you don't really have to know what the problem is or how to solve it in order to answer this question. The question seems to asking whether you understand when to use variables instead of "symbolic parameters". I'm not actually sure what is meant by "symbolic parameter"; it seems to be outdated nomenclature.

Having said that, solving the first part of the question (replacing spaces with tabs) is rather straight forward. Think division and remainders.

Whisty
I think symbolic parameter here would be a #define
asdfg
A: 

I took a very cursory look at your code, and nothing is jumping out at me as blatantly wrong.

So my advice would be to either single-step through a few input examples in a debugger, examining variable values as you go, or add a whole bunch of debugging print statements. In either case, your goal is to find the point where the state of the program starts to deviate from what you expected or intended.

Jim Lewis
My program doesn't at all solve the solution Jim :/
asdfg
A: 

I agree with your assessment. It won't be enough to replace every n blanks with a tab; for example, if n == 4, "hi blank blank blank blank" should not be replaced by "hi tab", but rather by "hi tab blank blank".

It sounds like what you need to do is keep track of the current position as you're reading in each line, and use this information to determine how many tabs you need. Does this help? Please let me know if you need more details!

As for the "variable vs. symbolic parameter" part, either would definitely be viable, but I can think of one significant advantage to using a variable: you can run the program for different values of n without recompiling.

Elliott
Still not getting it :)
asdfg
+8  A: 

If your question is "What is this asking me to do?" I think I can help by paraphrasing the original question (posing the same question in a different way).

Write a program that takes as input text with spaces and produces as output visually equivalent text using tabs to the maximum extent possible.

For example, with tabstops every 8 characters, and showing spaces as '.' and tabs as '-';

input;
".foo:...bar;......#comment"
output;
".foo:-bar;-..#comment"

input;
".......-foo:.....bar;......#comment"
output;
"-foo:-.bar;-...#comment"

Write the program so that tabstop parameter n can be varied, i.e. allow values of n other than 8. Be prepared to justify your decision to make n a constant, or alternatively a variable.

Edit I had a look at your code and I think it is more complex than it needs to be. My advice is to do it a character at a time. There's no need to buffer a whole line. Maintain a column count as you read each character ('\n' resets it to zero, '\t' bumps it by 1 or more, other characters increment it). When you see a space (or tab), don't emit anything right away, start your entabbing process, emit zero or more tabs and then spaces later (at '\n' or a non whitespace character, whichever comes first).

A final hint is that a state machine can make this kind of algorithm a lot easier to write, validate, test and read.

Edit 2 In a shameless attempt to get the OP to accept my answer, I have now gone ahead and actually coded a solution myself, based on the hints I offered above and my comment in the discussion.

// K&R Exercise 1-21, entab program, for Stackoverflow.com
#include <stdio.h>
#define N 4     // Tabstop value. Todo, make this a variable, allow
                //  user to modify it using command line

int main()
{
    int col=0, base_col=0, entab=0;

    // Loop replacing spaces with tabs to the maximum extent
    int c=getchar();
    while( c != EOF )
    {

        // Normal state
        if( !entab )
        {

            // If whitespace goto entab state
            if( c==' ' || c=='\t' )
            {
                entab = 1;
                base_col = col;
            }

            // Else emit character
            else
                putchar(c);
        }

        // Entab state
        else
        {

            // Trim trailing whitespace
            if( c == '\n' )
            {
                entab = 0;
                putchar( '\n' );
            }

            // If not whitespace, exit entab state
            else if( c!=' ' && c!='\t' )
            {
                entab = 0;

                // Emit tabs to get close to current column position
                //  eg base_col=1, N=4, col=10
                //  base_col + 3 = 4 (1st time thru loop)
                //  base_col + 4 = 8 (2nd time thru loop)
                while( (base_col + (N-base_col%N)) <= col )
                {
                    base_col += (N-base_col%N);
                    putchar( '\t' );
                }

                // Emit spaces to close onto current column position
                // eg base_col=1, N=4, col=10
                //  base_col -> 8, and two tabs emitted above
                //  base_col + 1 = 9 (1st time thru this loop)
                //  base_col + 1 = 10 (2nd time thru this loop)
                while( (base_col + 1) <= col )
                {
                    base_col++;
                    putchar( ' ' );
                }

                // Emit buffered character after tabs and spaces
                putchar( c );
            }
        }

        // Update current column position for either state
        if( c == '\t' )
            col += (N - col%N); // eg col=1, N=4, col+=3
        else if( c == '\n' )
            col=0;
        else
            col++;

        // End loop
        c = getchar();
    }
    return 0;
}
Bill Forster
The main catch is dealing with 3 blanks followed by a tab at the start of a line, or other similar sequences of interleaved blanks and tabs. You need to omit the 3 blanks from the output for any tab stop of size 4 or larger.
Jonathan Leffler
@Jonathan Leffler. Good point, I have added an example that illustrates that wrinkle.
Bill Forster
I understand what you're saying, but I don't understand how to articulate it in code. :(
asdfg
Where you say " ... '\t' bumps it up by 1 or more, other characters increment it", could you elaborate please?
asdfg
@asdfg I wonder if I can do this in a comment; Initially col=0. We read 'a'. Emit 'a' and now col=1. Read 'b'. Emit 'b' and now col=2. Read ' '. Go into entab state, set base_col=2, emit nothing and now col=3. Read '\t'. Stay in entab state, calculate new col=8 (so it increased by 5 in this case). Read ' '. Stay in entab state now col=9. Read 'c'. Exit entab state, use base_col and col to calculate how many tabs then spaces to emit, then emit '\t',' '. Then emit 'c' and increment col to col=10. So input so far "ab.-.c", output so far "ab-.c" (one space has been eliminated). Get it ?
Bill Forster
@asdfg Just alerting you that I've now written the code to go with my analysis (Edit 2 in my answer)
Bill Forster
Bill, I just wrote a testcase, here it is: http://codepad.org/9LKvVB6VReceived a bit more help in terms of a blueprint on how it'd be done, and pretty much most of the code, but I understand it now I guess :xthanks
asdfg
A: 

I am currently plowing KnR and came across this page:

Answers to Exercises

Your exercise are located under:

Hopefully you find this useful.

Sincerely, Morpfh

1: http://users.powernet.co.uk/eton/kandr2/index.html "The C Programming Language", 2nd edition, Kernighan and Ritchie - Answers to Exercises

Morpfh
Morpfh