See also This Post In Word Forum

Or else merging similar records


The problem arises in the automated merging of database records.

I detect that two records are similar in the essential key fields, such as SurName and Given name, and I want to automate the process of determining which data content from the two records is most suitable for my purposes.

Yes, the procedure is risky, because it's not always true that "Bigger is Better", nonetheless, we elect to go ahead with a set of rules.


Rule 1: An empty field has nothing of value to offer to us.

Rule 2: We use a non-empty field in preference to an empty field.

Rule 3: We use a container field (905-474-9146) in preference to a contained field (474-9146).

Rule 4: We use a longer field in preference to a shorter field.

Rule 5: We use a higher-sort sequence field in preference to a lower-sorting field.

Rule 6: We also might consider an option to merge all the data from all the fields where the target field allows this. Thus we might make a huge Comment data field, merging the comment strings from each of the individuals.


I still feel uneasy at recordsets, so I've opted to copy the fields into a string array. I can deal with string arrays, and when I'm done I can delete the first record and load the second record from the string array.

And I can re-use this function any time I have arrays that need merging.


<pre>Public Function ReductionMechanism(strAr() As String)
' merge the elements of strAr(0) into the elements of strAr(1)
' so that strAr(0) can be discarded
' and strAr(1) retained.

Dim i As Integer
For i = 0 To UBound(strAr, 2)
: Rule 1
' If the first string is null, then it has nothing at all to offer us
If strAr(0, i) = "" Then
: Rule 2
' If one string is null, the other must be just as good a pick!
ElseIf strAr(1, i) = "" Then
strAr(1, i) = strAr(0, i)
: Rule 3
' If one string is wholly contained in the other, choose the container.
ElseIf InStr(1, strAr(0, i), strAr(1, i)) > 0 Then
strAr(1, i) = strAr(0, i)
ElseIf InStr(1, strAr(1, i), strAr(0, i)) > 0 Then
strAr(0, i) = strAr(0, i)
: Rule 4
' If one string is longer than the other, choose the longer.
ElseIf Len(strAr(0, i)) > Len(strAr(1, i)) Then
strAr(1, i) = strAr(0, i)
ElseIf Len(strAr(0, i)) < Len(strAr(1, i)) Then
strAr(0, i) = strAr(0, i)
: Rule 5
' If one string sorts higher, choose the higher
ElseIf strAr(0, i) > strAr(1, i) Then
strAr(1, i) = strAr(0, i)
ElseIf strAr(0, i) < strAr(1, i) Then
strAr(0, i) = strAr(0, i)
Else
: Rule 6
' If all else fails, append the first string after the second.
' I don't see how this situation can arise after following all the previous rules!!!
strAr(1, i) = strAr(1, i) & strAr(0, i)
End If
Next i
End Function

Sub TESTReductionMechanism()
Dim strAr(1, 7) As String

strAr(0, 0) = ""
strAr(1, 0) = "first string is null"

strAr(0, 1) = "second string is null"
strAr(1, 1) = ""

strAr(0, 2) = "967-1111"
strAr(1, 2) = "416-967-1111" ' second string is container

strAr(0, 3) = "416-967-1111" ' first string is container
strAr(1, 3) = "967-1111"

strAr(0, 4) = "1234567"
strAr(1, 4) = "abcdefgh" ' second string is longer

strAr(0, 5) = "abcdefgh" ' first string is longer
strAr(1, 5) = "1234567"

strAr(0, 6) = "bcdef" ' first string sorts higher
strAr(1, 6) = "abcde"

strAr(0, 7) = "abcdef"
strAr(1, 7) = "bcdefg" ' second string sorts higher

Call ReductionMechanism(strAr)
End Sub
</pre>