tags:

views:

499

answers:

2

I am regularly required to compare data sent to me in Excel spreadsheets with data that lives in SQL Server. I know that you can connect SQL Server to spreadsheets but it always seemed clunky

This is really a post to show off my solution but I would love to hear other peoples ideas.

+2  A: 

For best results, paste the below code into a module in your personal.xls file. You will need to add a reference to the Microsoft Forms 2.0 Object Library.

When you run this routine, it takes the currently highlighted region and creates an XML string. It also creates the TSQL to convert that XML into a temporary table called #tmp. It also pastes the TSQL into your clipboard. It makes a lot of assumptions and the default temporary table is all VARCHAR(100).

I bound this routine to Cntl-Shift-X.

The end result is if i highlight a reagion (with header), click Cntl-Shift-X, and past into a query window, I have immediate access to the spreadsheet data in SQL.

I't has save me tons of time.

Recommendations for improvements are welcome :o)

Sub CreateOpenXML()

    Dim cols, rows As Long
    cols = Selection.Columns.Count
    rows = Selection.rows.Count
    Dim Header() As String
    ReDim Preserve Header(cols)
    For i = 1 To cols  '''Each Column In Selection.Rows(0).Columns
        Header(i) = CleanHeader(Selection.Cells(1, i).Value)
        'Header(i) = Application.WorksheetFunction.Substitute(CleanString(Selection.Cells(1, i).Value), " ", "_")
        'Header(i) = Application.WorksheetFunction.Substitute(Header(i), "(", "_")
        'Header(i) = Application.WorksheetFunction.Substitute(Header(i), ")", "_")
        'i = i + 1
    Next
    Dim theXML As String, tmpXML As String, counter As Integer

    theXML = "DECLARE @DocHandle int" & vbCrLf
    theXML = theXML & "DECLARE @XmlDocument varchar(8000)" & vbCrLf
    theXML = theXML & "EXEC sp_xml_preparedocument @DocHandle OUTPUT, N'<theRange>" & vbCrLf
    tmpXML = ""
    counter = 0
    For i = 2 To rows
        tmpXML = tmpXML & vbTab & "<theRow>"
        For j = 1 To cols
            If Selection.Cells(i, j).Text <> "NULL" And Selection.Cells(i, j).Text <> "" Then
                tmpXML = tmpXML & "<" & Header(j) & ">" & CleanString(Selection.Cells(i, j).Text) & "</" & Header(j) & ">"
                'tmpXML = tmpXML & CleanString(Selection.Cells(i, j).Text)
                'tmpXML = tmpXML & "</" & Header(j) & ">"
            End If
        Next j
        tmpXML = tmpXML & "</theRow>" & vbCrLf
        counter = counter + 1
        If counter = 200 Then
            theXML = theXML & tmpXML
            tmpXML = ""
            counter = 0
        End If
    Next i
    theXML = theXML & tmpXML
    theXML = theXML & "</theRange>'" & vbCrLf & vbCrLf
    '''theXML = theXML & "EXEC sp_xml_preparedocument @DocHandle OUTPUT, @XmlDocument" & vbCrLf
    theXML = theXML & "SELECT "
    For i = 1 To cols
        theXML = theXML & "[" & Header(i) & "]"
        If i <> cols Then theXML = theXML & ", "
    Next
    theXML = theXML & vbCrLf
    theXML = theXML & "INTO #tmp"
    theXML = theXML & vbCrLf
    theXML = theXML & "FROM OPENXML (@DocHandle, '/theRange/theRow',2) WITH (" & vbCrLf
    For i = 1 To cols
        theXML = theXML & vbTab & "[" & Header(i) & "] varchar(100)"
        If i <> cols Then theXML = theXML & ","
        theXML = theXML & vbCrLf
    Next
    theXML = theXML & ")" & vbCrLf
    theXML = theXML & "EXEC sp_xml_removedocument @DocHandle" & vbCrLf
    theXML = theXML & vbCrLf
    theXML = theXML & "Select * from #tmp" & vbCrLf
    theXML = theXML & vbCrLf
    theXML = theXML & "--DROP TABLE  #tmp"
    theXML = theXML & vbCrLf
    MsgBox "The XML has been copied to the clipboard"
    Dim dob As New DataObject
    dob.SetText (theXML)
    dob.PutInClipboard

End Sub

Function CleanString(orig As String)
    Dim tmp As String
    tmp = orig
    '''MsgBox InStr(orig, "&")
    If InStr(orig, "&") > 0 Or InStr(orig, "'") > 0 Or InStr(orig, "<") > 0 Or InStr(orig, ">") > 0 Or InStr(orig, """") > 0 Then
        tmp = Application.WorksheetFunction.Substitute(tmp, "&", "&amp;")
        tmp = Application.WorksheetFunction.Substitute(tmp, "'", "&apos;")
        tmp = Application.WorksheetFunction.Substitute(tmp, "<", "&lt;")
        tmp = Application.WorksheetFunction.Substitute(tmp, ">", "&gt;")
        tmp = Application.WorksheetFunction.Substitute(tmp, """", "&quot;")
    End If
    CleanString = tmp

End Function

Function CleanHeader(orig As String)
    Dim tmp As String
    tmp = Trim(orig)
    If InStr(orig, " ") > 0 Or InStr(orig, "(") > 0 Or InStr(orig, ")") > 0 Or InStr(orig, "$") > 0 Or InStr(orig, "/") > 0 Or InStr(orig, "?") > 0 Or InStr(orig, "&") > 0 Or InStr(orig, "'") > 0 Or InStr(orig, "<") > 0 Or InStr(orig, ">") > 0 Or InStr(orig, """") > 0 Then
        tmp = Application.WorksheetFunction.Substitute(tmp, "&", "And")
        tmp = Application.WorksheetFunction.Substitute(tmp, "'", "_")
        tmp = Application.WorksheetFunction.Substitute(tmp, "<", "")
        tmp = Application.WorksheetFunction.Substitute(tmp, ">", "")
        tmp = Application.WorksheetFunction.Substitute(tmp, """", "")
        tmp = Application.WorksheetFunction.Substitute(tmp, " ", "_")
        tmp = Application.WorksheetFunction.Substitute(tmp, "(", "_")
        tmp = Application.WorksheetFunction.Substitute(tmp, ")", "_")
        tmp = Application.WorksheetFunction.Substitute(tmp, "$", "")
        tmp = Application.WorksheetFunction.Substitute(tmp, "/", "")
        tmp = Application.WorksheetFunction.Substitute(tmp, "?", "")
    End If
    CleanHeader = tmp

End Function

Sub MakeText()

    ActiveCell.CurrentRegion.Select
    Dim rng As Range
    Set rng = Selection

    Dim str As String
    For i = 1 To rng.rows.Count
        For j = 1 To rng.Columns.Count
            str = Application.WorksheetFunction.Text(rng.Cells(i, j).Value, "#")
            rng.Cells(i, j).NumberFormat = "@"
            rng.Cells(i, j).Value = str
        Next j
    Next i

End Sub

As suggested, here's an example. Consider this spreadsheet data:

Name              DOB       Score   Comment
John Smith        7/1/1990  93      Great effort
Sue Jones         1/1/1989  95      Super achievement
Robin Sixpack     12/1/1985 100     OK

This method will generate the following TSQL:

DECLARE @DocHandle int
DECLARE @XmlDocument varchar(8000)
EXEC sp_xml_preparedocument @DocHandle OUTPUT, N'<theRange>
    <theRow><Name>John Smith</Name><DOB>7/1/1990</DOB><Score>93</Score><Comment>Great effort</Comment></theRow>
    <theRow><Name>Sue Jones</Name><DOB>1/1/1989</DOB><Score>95</Score><Comment>Super achievement</Comment></theRow>
    <theRow><Name>Robin Sixpack</Name><DOB>12/1/1985</DOB><Score>100</Score><Comment>OK</Comment></theRow>
</theRange>'

SELECT [Name], [DOB], [Score], [Comment]
INTO #tmp
FROM OPENXML (@DocHandle, '/theRange/theRow',2) WITH (
    [Name] varchar(100),
    [DOB] varchar(100),
    [Score] varchar(100),
    [Comment] varchar(100)
)
EXEC sp_xml_removedocument @DocHandle

Select * from #tmp

--DROP TABLE  #tmp
wcm
Could you show that the generated results might look like, just for an example? FYI - this is an awesome example, and would make a great blog post!
Mitchel Sellers
Thanks Mitchell! Tell your friends. It has been a great tool and I'm sort of proud of it even though I plan on making several of the changes that Mike suggested above.
wcm
I just checked out your blog. I'm honored to have a distinguished author say something nice about my little code snippet.
wcm
+1  A: 

I find I tend to break out in a rash when I have to work with spreadsheets containing data in an uncertain format that may change over time.

A couple of observations on the code:

While Application.WorksheetFunction.Substitute does the job, VB/VBA has the Replace function, which is a little more succinct. It's probably not especially important from a performance point-of-view here, but one should generally try to reference the Application object or Workbook/Worksheets as little as possible within code, as the costs of the round-trip from code to app do tend to add up. For that reason, when iterating across a Range, it's usually good sense to load the values into a Variant, as in

Dim values as Variant
values = Selection.Values

and loop over the array to eliminate that round-trip every time you reference .Cells.

I got a little bored of theXML = theXML & - it gets hard to see what's going on. You might consider writing a little StringBuilder class, say, so you could reduce

 theXML = theXML & "INTO #tmp"

to

 sb.Add "INTO #tmp"

The Add method could handle all that & vbCrLf business, too, which would, frankly, be a blessing.

That said, I'm wondering about the business process that requires regular checks of this sort. Is the intention to ensure that the data is the same in both places? Duplication/reconciliation is often a sign of a process in need of some refactoring. If you're looking for differences, might there be a better way of recording them? How might things be changed so that the data can only ever be changed in the database? Just wondering...

Mike Woodhouse
Thanks for the suggestions. I will probably be incorporating many of your suggestions over the next few weeks. As for the business processes, data questions tend to flow towards me. Is it the best use of my time? Probably not. It's not that bad and this tool makes things easier.
wcm