views:

2360

answers:

3

As part of an overhaul of a report generator I saw what I believed to be inefficient code. This part of the code runs after the main report is generated to set the page breaks in logical positions. The criteria is this:

  • Each Site starts on a new page.
  • Group's aren't allowed to broken across pages.

The code follows the above format: 2 loops doing those jobs.

This is the original code (sorry for the length):

Public Sub PageBreak(ByRef wstWorksheet As Excel.Worksheet, ByVal pctProgress As ProgressCtl.ProgressControl)
Dim breaksMoved As Integer
Dim p As HPageBreak
Dim i As Integer

'Used as a control value
breaksMoved = 1

' Marks that no rows/columns are to be repeated on each page
wstWorksheet.Activate
wstWorksheet.PageSetup.PrintTitleRows = ""
wstWorksheet.PageSetup.PrintTitleColumns = ""

'If this isn't performed beforehand, then the HPageBreaks object isn't available
Range("A3").Select
ActiveWindow.View = xlPageBreakPreview

'Defaults the print area to be the entire sheet
wstWorksheet.DisplayPageBreaks = False
wstWorksheet.PageSetup.PrintArea = ""

Range("$B$4").Select

' add breaks after each site
Do While ActiveCell.Row <= wstWorksheet.UsedRange.Rows.Count
    If ActiveCell.FormulaR1C1 = "Site ID" Then
        ActiveCell.PageBreak = xlPageBreakManual
    End If
    ActiveCell.Offset(1, 0).Activate
    pctProgress.ProgressText = "Row " & CStr(ActiveCell.Row) & " of " & CStr(wstWorksheet.UsedRange.Rows.Count)
Loop

Dim passes As Long
Do While breaksMoved = 1
    passes = passes + 1
    breaksMoved = 0
    For i = 1 To wstWorksheet.HPageBreaks.Count - 1
            Set p = wstWorksheet.HPageBreaks.Item(i)
            'Selects the first page break
            Range(p.Location.Address).Select
            'Sets the ActiveCell to 1 row above the page break
            ActiveCell.Offset(-1, 0).Activate

            'Move the intended break point up to the first blank section
            Do While Not ActiveCell.FormulaR1C1 = ""
                ActiveCell.Offset(-1, 0).Activate
                breaksMoved = 1
            Loop

            'Add the page break
            If ActiveCell.FormulaR1C1 <> "Site ID" Then
                ActiveCell.Offset(1, 0).Activate
                wstWorksheet.HPageBreaks.Add ActiveCell
            End If

            pctProgress.ProgressText = "Set break point " & CStr(passes) & "." & CStr(i)

    Next

Loop

'Reset the view to normal
wstWorksheet.DisplayPageBreaks = True
ActiveWindow.View = xlNormalView
Range("A3").Select
End Sub

Seeing room for improvement I set about modifying this. As one of the new requirements the people wanting the report were manually removing pages prior to printing. So I added checkboxes on another page and copied the checked items across. To ease that I used named ranges. I used these named ranges to meet the first requirement:

' add breaks after each site   
For Each RangeName In ActiveWorkbook.Names
    If Mid(RangeName.Name, 1, 1) = "P" Then
        Range(RangeName).Activate
        ActiveCell.Offset(Range(RangeName).Rows.Count - 1, 0).Select
        ActiveCell.PageBreak = xlPageBreakManual
    End If
Next RangeName

All Ranges are prefixed with P_ (for parent). Using the lame Now() style of rough timing this is 1 second slower on my short 4 site report and the more challenging 15 site report. These have 606 and 1600 rows respectively.

1 second isn't so bad. Lets look at the next criteria. Each logical group is split by a blank row, so the easiest way is to find the next page break, step back until you find the next blank line and insert the new break. Rinse and repeat.

So why does the original run through multiple times? We can improve that too (the boiler plate outside the loops is the same).

Dim i As Long
Dim oPageBreak As HPageBreak
Do While i < shtDeliveryVariance.HPageBreaks.Count - 1
    i = i + 1
    pctProgress.ProgressText = "Setting Page Break " & CStr(i) & " of " & CStr(shtDeliveryVariance.HPageBreaks.Count)

    Set oPageBreak = shtDeliveryVariance.HPageBreaks.Item(i)

    ' select the page break
    Range(oPageBreak.Location.Address).Select
    ActiveCell.Offset(-1, 0).Activate

    ' move up to a free row
    Do While Not ActiveCell.FormulaR1C1 = ""
        ActiveCell.Offset(-1, 0).Activate
    Loop

    'Add the page break
    If ActiveCell.FormulaR1C1 <> "Site ID" Then
        ActiveCell.Offset(1, 0).Activate
        shtDeliveryVariance.HPageBreaks.Add ActiveCell
    End If

Loop

One pass and more elegant too. But how much quicker is it? On the small test is takes 54 seconds compared to the original 45 seconds, and on the larger test my code is slower again at 153 to 130 seconds. And this is averaged over 3 runs too.

So my questions are: Why is my newer code so much slower than the original despite mine looking faster and what can I do to speed up the slowness of the code?

Note: Screen.Updating, etc. is already off as is Calculation etc.

+1  A: 

I took a quick view of your code and my first thought is that this line:

pctProgress.ProgressText = "Setting Page Break " & CStr(i) & " of " & CStr(shtDeliveryVariance.HPageBreaks.Count)

may be a cause of some of the delay. The location of this code means that the system has to go and recalculate the .Count value since it comes at the beginning of the loop in your code, but this recalculation does not happen in the original.

Other thoughts:

Depending on the spreadsheet size, going out and remeasuring this value may be slowing things down. Why not just manually increment a breaks count tracking variable when you actually perform the addition of a new break instead of having the system go and count it, or get rid of the counting in the loop (since you're not updating the display anyways during this process) and put the counting of page breaks in to its own code segment that runs through the content at the end of the whole formatting process when a final number of page breaks can easily be determined with a single call?

Ron McMahon
+2  A: 

Hey Graham,

The easy answer is that you use ActiveCell and Select and Activate. Excel actually selects the cells as your code is running, making the code run slower (as you've noticed).

I would recommend using a Range as a reference and do all the tests "in memory".

Dim a range for tracking (dim rngCurrentCell as range) and use that instead of the selecting the cells.

So, for the first appearance of Select in your code "Range("A3").Select", you would 'Set' it as 'Set rngCurrentCell = Range("A3")'. The same for the Next B4 line.

Then:`

' add breaks after each site
Do While ActiveCell.Row <= wstWorksheet.UsedRange.Rows.Count 

If ActiveCell.FormulaR1C1 = "Site ID" Then
ActiveCell.PageBreak = xlPageBreakManual    
End If    
' Offset the row by one and set our new range
set rngCurrentCell = rngCurrentCell.Offset(1, 0)

pctProgress.ProgressText = "Row " & CStr(ActiveCell.Row) & " of " & CStr(wstWorksheet.UsedRange.Rows.Count)

Loop

` And so forth.

Now to test values use the same syntax as the "ActiveCell".

Any questions, let me know,

Just Plain, Bill

JustPlainBill
+6  A: 

Hi Graham,
I see room for improvement in a couple spots in your code:

  1. Don't access properties that are implemented slowly, like usedrange.rows.count more than once(particularly inside a loop) unless you think they may have changes. Instead store them in a variable.
  2. Don't do text comparisons if you can avoid it (Ex: .Value = ""), instead use the LenB function to check for emptiness, it will execute faster as it's just reading the length of the string header instead of launching into a byte by byte string comparison. (You might enjoy this for reading.)
  3. Don't use "Activate" or "Select" to move around the ActiveCell, just access the range directly.
  4. When looping, structure your loop to have to perform as few tests as possible. If the loop must always execute once, then you want a post-test loop.
  5. Make sure you have the Excel interface locked, as running events and screen-updating etc, can slow your code down a lot. (Especially events.)
  6. Finally, I noticed that you are making assumptions about the case of "Site ID", unless there is no possible way it could be cased otherwise, it's best to do a case insensitive comparison. If you know for a fact that it will be Cased that way you can of course remove the calls to LCase$ that I added.

I refactored the original code to give you an example of some of these ideas. Without knowing your data layout, it's hard to be sure if this code is 100% valid, so I would double check it for logic errors. But it should get you started.

Public Sub PageBreak(ByRef wstWorksheet As Excel.Worksheet, ByVal pctProgress As ProgressCtl.ProgressContro)
        Const lngColSiteID_c As Long = 2&
        Const lngColSiteIDSecondary_c As Long = 1&
        Const lngOffset_c As Long = 1&
        Dim breaksMoved As Boolean
        Dim lngRowBtm As Long
        Dim lngRow As Long
        Dim p As Excel.HPageBreak
        Dim i As Integer
        Dim passes As Long
        Dim lngHBrksUprBnd As Long
        LockInterface True
        ' Marks that no rows/columns are to be repeated on each page
        wstWorksheet.Activate
        wstWorksheet.PageSetup.PrintTitleRows = vbNullString
        wstWorksheet.PageSetup.PrintTitleColumns = vbNullString


        'If this isn't performed beforehand, then the HPageBreaks object isn't available
        '***Not true:)***

        'ActiveWindow.View = xlPageBreakPreview

        'Defaults the print area to be the entire sheet
        wstWorksheet.DisplayPageBreaks = False
        wstWorksheet.PageSetup.PrintArea = vbNullString

        ' add breaks after each site
        lngRowBtm = wstWorksheet.UsedRange.Rows.Count
        For lngRow = 4& To lngRowBtm
            'LCase is to make comparison case insensitive.
            If LCase$(wstWorksheet.Cells(lngRow, lngColSiteID_c).value) = "site id" Then
                wstWorksheet.Cells(lngRow, lngColSiteID_c).PageBreak = xlPageBreakManual
            End If
            pctProgress.ProgressText = ("Row " & CStr(lngRow)) & (" of " & CStr(lngRowBtm))
        Next

        lngHBrksUprBnd = wstWorksheet.HPageBreaks.Count - lngOffset_c
        Do  'Using post test.
            passes = passes + lngOffset_c
            breaksMoved = False
            For i = 1 To lngHBrksUprBnd
                Set p = wstWorksheet.HPageBreaks.Item(i)
                'Move the intended break point up to the first blank section
                lngRow = p.Location.Row - lngOffset_c
                For lngRow = p.Location.Row - lngOffset_c To 1& Step -1&
                    'Checking the LenB is faster than a string check.
                    If LenB(wstWorksheet.Cells(lngRow, lngColSiteIDSecondary_c).Formula) = 0& Then
                        lngRow = lngRow - lngOffset_c
                        If LCase$(wstWorksheet.Cells(lngRow, lngColSiteIDSecondary_c).value) = "site id" Then
                            breaksMoved = True
                            wstWorksheet.HPageBreaks.Add wstWorksheet.Cells(lngRow, lngColSiteIDSecondary_c)
                        End If
                        Exit For
                    End If
                Next
                pctProgress.ProgressText = "Set break point " & (CStr(passes) & "." & CStr(i))
            Next
        Loop While breaksMoved
        LockInterface False
    End Sub

    Private Sub LockInterface(ByVal interfaceOff As Boolean)
        With Excel.Application
            If interfaceOff Then
                .ScreenUpdating = False
                .EnableEvents = False
                .Cursor = xlWait
                .StatusBar = "Working..."
            Else
                .ScreenUpdating = True
                .EnableEvents = True
                .Cursor = xlDefault
                .StatusBar = False
            End If
        End With
    End Sub
Oorang
I would of thought that .Count would actually access a member variable and not a full scan, but if that is the way it is, then I will 'contrast and compare'.
graham.reeds
@Oorang: If I could give you 2 upvotes I would - this is a lot of good information.
graham.reeds
Never knew that about 2). That's a good link.
graham.reeds
The proof, they say is in the pudding. The original method on a 4883 rows clocks in at 830 seconds. Your revised method clocks in at a staggering 131. Need to take what I have learnt here and apply it to the other functions that are slow (ie: The functions that create the 4883 rows...)
graham.reeds
Happy to help. Regarding properties: there isn't *always* that much of a gain, it depends on how the author of the class implemented the property:) It's just one of those things you can do to resolve the doubt in your favor.http://msdn.microsoft.com/en-us/library/sk5e8eth(VS.80).aspx
Oorang
The problem that required the line to be added to enable the HBreak object has come back: I reenabled all the disabled sheets (to speed testing of the parts I were working on) and now when it gets to the repositioning of hbreaks it get's to 50 items in (the number of initial hbreaks added) and gives a subscript error. However the Count says there are 120+ items in the array. It appears to be related to this kb article: http://support.microsoft.com/kb/210663
graham.reeds