views:

1335

answers:

4

I have a huge (~950 variables) survey response spreadsheet, made 2-4x larger than it needs to be because each item's individual response options are reported in separate columns. E.g., If Question 2 is in cell A1 and has 3 response options, these are listed below the question in cells A2-C2, A3-C3, etc. Note: Only one of A2-C2, etc. is filled w/ data for each observation.

It seems clear that I could go through the spreadsheet manually and concatenate A2-C2 using code similar to this:

=CONCATENATE(IF(ISBLANK(A4),"",A4),IF(ISBLANK(B4),"",B4),IF(ISBLANK(C4),"",C4))

But, the above requires manually altering the code (questions have anywhere from 2-6 response options) and copying/pasting it through all observations for each of the 960 variables individually. Because I'd like to finish this before dying of old age, I'd love to have some way of automating the above process.

It would be easy enough to go through and insert an identifier column (w/ no data but w/ some obvious 1st row name) after each variable's last response option so the code knows where to begin looking back for data to concatenate. The code would have to know to stop whenever it hits the previously concatenated result, and begin the current concatenation w/ the subsequent column. Once it hits a certain line (say, 60? I only have ~55 observations), it could just look for the next identifier column and repeat the process.

Any ideas would be much appreciated.

Thanks!

+1  A: 
e.James
Used your code, but the line "=MATCH(COLUMN(),58:58,1)" returns a value of "16384" across the entire row with the error message "Formula Refers to Empty Cell". When I trace the empty cells, a red arrow begins in A58 and extends to the current cell (from which i traced the error). Any ideas?
That's strange. I tried this out in my copy of Excel, and it worked just fine. Are the question cells in row 1 merged at all?
e.James
Not sure if it will help, but I usually turn off all background error checking in Excel. That gets rid of those pesky green warnings. See http://www.mackb.com/Uwe/Forum.aspx/excel/2600/Little-green-triangles
e.James
It should also be ROW(A2) instead of ROW(A3) in that formula. I've made the change in my answer
e.James
A: 

@EJames: Thanks for the suggestion & pointer to lay out the example graphically. I need to check for blanks so as to not have extra spaces in the data (to aid future analysis).

Hopefully the below will clarify the situation further. You're correct in your earlier comment; each question is followed by 55 rows of observations. For example:

 | |   Q1   |         |        ||   Q2    |        || 
 |-|--------|---------|--------||---------|--------||
 |1|   1    |         |        ||         |    2   ||
 |2|        |    2    |        ||         |    2   ||
 |3|        |         |    3   ||         |    2   ||
 |4|        |    2    |        ||    1    |        ||
 |5|        |         |    3   ||         |        ||
 |6|   1    |         |        ||         |    2   ||
 |7|   1    |         |        ||    1    |        ||
 |8|        |         |    3   ||    1    |        ||
 |9|        |    2    |        ||         |    2   ||

There isn't currently a blank column after the last response option for each question, but (as mentioned in the initial post) I can easily throw one in.

Much obliged.

A: 

The fastest way:

  1. Add an new column after column D
  2. Put 'Q1' in cell B1
  3. Put the following formula in: =SUM(B2:D2)
  4. Copy the formula down to the last row
  5. Repeat the steps above for all questions
  6. Select all data (ctrl-*)
  7. Copy the selection
  8. Choose 'Paste Special' from the context menu and choose 'values'
  9. Delete the original columns

If you want you can create a macro that does this automatically.

Here is the macro. It is my no means my best piece of coding ever. What would you expect in 15 minutes. It does the job, although it crashes when finished. ;o)

  1. Open your excel sheet
  2. Make a back up copy
  3. Hit Alt-F11
  4. Insert a new module
  5. Paste the code below
  6. Put the cursor inside the macro
  7. Hit F8 to step through the code

Since you are in Stack Overflow I assume you will be able to adjust the macro to further tailor your needs.

Sub Main()

    Dim ColumnsCount As Integer
    ColumnsCount = Range("A1").CurrentRegion.Columns.Count

    For i = 2 To 20000

        Dim CurrentCell As Range
        Set CurrentCell = Range("A1").Offset(0, i - 1)

        If CurrentCell.Value <> "" Then

            CurrentCell.Select
            Selection.End(xlToRight).Select

            Dim AnswersCount As Integer
            AnswersCount = Selection.Column - CurrentCell.Column

            CurrentCell.Offset(0, AnswersCount).Select
            Selection.EntireColumn.Insert

            Selection.Value = CurrentCell.Value
            i = i + AnswersCount

            Selection.Offset(1, 0).Select
            Selection.FormulaR1C1 = "=SUM(RC[" + CStr(AnswersCount * -1) + "]:RC[-1])"

            Selection.Copy
            Range(Selection, Selection.Offset(100, 0)).Select
            ActiveSheet.Paste

            Selection.EntireColumn.Select
            Application.CutCopyMode = False
            Selection.Copy
            Selection.PasteSpecial Paste:=xlPasteValues, Operation:=xlNone, SkipBlanks:=False, Transpose:=False

        End If

    Next i

End Sub
Florian
a good idea, and right on if i had a macro to automate it. i'm trying not to have to repeat the action for all 960 questions. unfortunately i haven't yet delved into the world of macros, so am not sure how to write one. also, can a macro handle varying #s of columns for each item? thanks.
awesome. I appreciate the effort - will try it out as soon as I have a chance (probably tomorrow at some point) and post how it worked.
If it works, vote it up. If it doesn't... too bad. I'm only holiday next week. ;o)
Florian
This code doesn't seem to work correctly (with my sheet, at least). It does fine inserting the columns (and formulas, I think), but all new cells return "=SUM(RC[-5]:RC[-1])" or something similar (no values). When I change the cell format (e.g., to numeric), nothing happens. Any ideas? Thanks!
Just got back from my vacation... send me the sheet to [email protected] and I will have a look at it.
Florian
A: 

Thanks for the piece of information. Apart from your way one can also use the & character for concatenation, so instead of: =concatenate(B1,” “,C1,” “,D1) you could use: =B1 & ” ” & C1 & ” ” & D1