views:

141

answers:

3

I'm not a pro with regexp and especially in this specific context. Any idea how to use a function to do a RegExp Replace and then calling it from the query in Microsoft Access?

By the way, this query oddly enough doesn't work fully and maybe it's why this could use regexp.

The idea is to match the three fields in the temp_table query:

last_name || first_name || middle_initial
Blair     || Sheron     || S
Brown     || Wanda      || R
Rodriguez || Lillian    || M
Glaubman  || Alan       || 

with what is in either names_1 or names_2 field in print_ready:

BLAIR,SHERON S
BROWN,BRENON I H/E BROWN,WANDA R 
RODRIGUEZ,LILLIAN M
GLAUBMAN,ALAN & SHORSTEIN,LILLIAN 

For some strange reason, my query fails to return the four names above, even though they are obviously in the temp_table, as shown above. The trick though is that names_1 and names_2 will have more names than what I need (as shown in the second item in the list above) so I must use Like. And if there's a value in middle_initial column, then I need to check print_ready for that middle initial as well, otherwise I don't check for middle_initial. This ensures if there's a middle initial, it doesnt't return records with first and last name similarities, but only returns the record with the initial as well.

And it still gives me user type undefined for " colMatches As MatchCollection"

Public Function NameMatch(ByVal pLast As String, ByVal pFirst As String, ByVal pMiddle As Variant, ByVal pSearchField As String) As Boolean


Dim strReturn As String


Dim colMatches As MatchCollection
Dim RetStr As String
Dim objRegExp As RegExp

strReturn = "/(\s|[pLast])(\s|,)[pFirst]/"

   If Len(pMiddle) > 0 Then
    strReturn = "/(\s|[pLast])(\s|,)[pFirst]\[spMiddle]/"
End If

Set objRegExp = New RegExp
objRegExp.pattern = strReturn
objRegExp.IgnoreCase = True
objRegExp.Global = True

If (objRegExp.Test(pSearchField) = True) Then
Set colMatches = objRegExp.Execute(pSearchField)

For Each match In colMatches
    record += match.Value
Next

NameString = record

End If

End Function

SELECT
t.last_name,
t.first_name,
t.middle_initial,
p.names_1,
p.names_2
FROM temp_query AS t,
print_ready AS p
WHERE 
NameMatch(t.last_name, t.first_name, t.middle_initial, p.names_1) = True 
Or NameMatch(t.last_name, t.first_name, t.middle_initial, p.names_2) = True
And p.us_states_and_canada In ("FL", "NY");
A: 

Since your query will run within Access, the expression service will be available to allow the query to use a VBA user-defined function.

Public Function NameString(ByVal pLast As String, _
    ByVal pFirst As String, _
    ByVal pMiddle As Variant) As String

    Dim strReturn As String
    strReturn = pLast & "," & pFirst
    If Len(pMiddle & "") > 0 Then
        strReturn = strReturn & " " & pMiddle
    End If
    NameString = strReturn
End Function

Then your query can look for matching name patterns in the print_ready target fields.

SELECT
    t.last_name,
    t.first_name,
    t.middle_initial,
    t.Name2Search,
    p.names_1,
    p.names_2
FROM
    (SELECT
        last_name,
        first_name,
        middle_initial,
        NameString(last_name, first_name, middle_initial) AS Name2Search
    FROM temp_table
    ) AS t,
    print_ready AS p
WHERE
    (p.names_1 Like "*" & t.Name2Search & "*")
    Or (p.names_2 Like "*" & t.Name2Search & "*");

Update: I overlooked your us_states_and_canada restriction. Add an "AND" condition to your WHERE clause.

p.us_states_and_canada In ("FL", "NY")
HansUp
DOes this account for the regexp part? THere can be a comma before the first name, or there can be a space before the first name, or there can be a comma or space before the first name and a space after the first name or it can be end of record. There will always be a space before middle initial, but there may or may not be a space after a middle initial. There may be a comma after last name or there may be a space after last - one or the other. Also there may be a space before last name or it may be the beginning of the record
JohnMerlino
No, I was not familiar with regular expressions as used in your sample, so my answer targeted the sample data instead.
HansUp
I never used VB before. Does it support regexp and have syntax like this: /,[last_name] /g
JohnMerlino
There is `Microsoft VBScript Regular Expressions`, but as I recall the RegEx object syntax is different than what you showed. What you used reminded me more of sed.
HansUp
Thanks for your help. I'm just going to figure out how to do it but presumably I will stick it where you do the assignment? strReturn = /(\s|pLast)(\s|,)pFirst\spMiddle/gi
JohnMerlino
I'm unsure, but my best guess is you should try something like: `Public Function NameMatch(ByVal pLast As String, ByVal pFirst As String, ByVal pMiddle As Variant, ByVal pSearchField As String) As Boolean`
HansUp
Then your WHERE clause might be: `WHERE NameMatch(last_name, first_name, middle_initial, names_1) = True Or NameMatch(last_name, first_name, middle_initial, names_2) = True`
HansUp
Unfortunately I don't know what to add for the "guts" of the function. And since your matching can't take advantage of indexes, it may run slower than a crippled pig.
HansUp
Is there a way I can have it take advantage of indexes, by turning the queries into tables and giving them primary keys?
JohnMerlino
An index on us_states_and_canada might help so that you can retrieve the FL and NY rows faster. But I think the big challenge is that you need two data sources without a JOIN condition, so you may be starting with something similar to a Cartesian product (cross join) of the two, then filtering out rows which don't satisfy the name match criteria in your WHERE clause. I don't see how indexes will help there.
HansUp
I added all your suggestions to my initial question. I put the function in its own module in VBA. And then I ran the query and it said "The Select Statement includes a reserved word or an argument name that is mispelled or missing, or the punctuation is incorrect.
JohnMerlino
The Test method will return True if the Regular Expression was found, and False if it was not found. IOW, Test returns a Boolen value ... which you are assigning to a String variable, which becomes the function's (Boolean) return value ... then comparing comparing that to Boolean True/False in the query's WHERE clause. That doesn't seem right. Have you tested the function in isolation (outside the query) to confirm if returns correct values for test inputs?
HansUp
I was sloppy in the earlier comment. Since your query uses aliases for the table names, include them with the field names you pass to the function ... `NameMatch(t.last_name, t.first_name` ...
HansUp
There is no need for a subquery now; you can use temp_table directly.
HansUp
Your RegExp pattern should be built with the parameter values, not the parameter names.
HansUp
You have a comma in the subsquery after middle_initial. Remove that comma. There should must not be a comma after the last field expression in the SELECT statement's field list. This is probably the cause of your current error message.
HansUp
I now run the query and I get "undefined function NameMatch in Expression. So I go to VBA and select run > debug and it says "user defined type not found" and it selects objMatch as Match in the newly updated code I posted in my original question.
JohnMerlino
I added another answer with a different version of the NameMatch function.
HansUp
A: 

I don't understand RegExp, but looking at @HansUp's answer, this seems entirely equivalent:

  SELECT last_name, first_name, middle_initial, names_1, names_2,
    Mid(("12"+last_name) & (", "+first_name)), 3) 
      & (" "+middle_initial) AS Name2Search
  FROM print_ready
  WHERE
    (names_1 Like "*" & Mid(("12"+last_name) & (", "+first_name)), 3) 
      & (" "+middle_initial) & "*")
    Or (names_2 Like "*" & Mid(("12"+last_name) & (", "+first_name)), 3) 
      & (" "+middle_initial) & "*");

But I may very well be completely misinterpreting what's being sought.

There's no really good way to do this. I would have a look at the data in names_1 and names_2 and see if it couldn't be parsed into indexable fields that could be compared to last_name and first_name and take advantage of those indexes.

David-W-Fenton
Does that account for regexp? I'm trying to find records where last name may appear at beginning of record or have a space before it and have a comma after it or have space, and where first name may have comma before it or space and it may be last record or have space after it and if there is a middle initial, it will have space before it and after it or it may have space before it and be the last item in the record. There may or may not be a middle initial and if there is, then I want to include in query so as not to get first and last names that are similar.
JohnMerlino
@David My original interpretation was wrong. For John Doe, I think he wants to match with "DOE,JOHN", "DOE JOHN", and "DOE, JOHN". It's more challenging than I thought when I submitted that first answer.
HansUp
I'd say he has crappy data and should run a lot of cleanup on it if he wants to get good results that don't take a bazillion years to run.
David-W-Fenton
A: 

Try this NameMatch function.

Public Function NameMatch(ByVal pLast As String, _
    ByVal pFirst As String, _
    ByVal pMiddle As Variant, _
    ByVal pSearchField As String) As Boolean

    Dim strPattern As String
    Dim objRegExp As Object

    ' .* = any characters, or none '
    ' \b = word boundary '
    strPattern = ".*\b" & pLast
    ' comma, space, or comma plus space '
    strPattern = strPattern & "(,|\s|,\s)"
    strPattern = strPattern & pFirst
    If Len(pMiddle & vbNullString) > 0 Then
        strPattern = strPattern & "\s" & pMiddle
    End If
    strPattern = strPattern & "\b.*"
    'Debug.Print strPattern '
    Set objRegExp = CreateObject("VBScript.RegExp")
    objRegExp.Pattern = strPattern
    objRegExp.IgnoreCase = True

    NameMatch = objRegExp.test(pSearchField)
    Set objRegExp = Nothing
End Function

Here is a copy & paste from the Immediate Window where I tested the function:

? NameMatch("Doe","John","A","foo Doe, John A bar")
True
? NameMatch("Doe","John","A","foo Doe,John A bar")
True
? NameMatch("Doe","John","A","foo Doe John A bar")
True
? NameMatch("Doe","John","A","foo Doe, John A")
True
? NameMatch("Doe","John","A","Doe, John A")
True
? NameMatch("Doe","John","A","foo Doe, John bar")
False
? NameMatch("Doe","John",Null,"foo Doe, John bar")
True
? NameMatch("Doe","John",Null,"foo Doe, John A bar")
True

Except for the last one, I think those are the results you wanted. But I don't know about the last one.

HansUp