Compare Data

Huey72

New Member
Joined
Nov 6, 2019
Messages
32
Hi everyone, I am migrating data between information systems, and want to compare the source dataset from the legacy system to the migrated dataset in the new system to ensure they match.

To do this, I've imported both datasets into the same workbook, into their respective worksheets, named Source and Migrated. The dataset in both sheets is large: 300 columns x 600,000 rows.

For smaller datasets, I have a macros which will loop through each cell in the data and return a list of mismatch results e.g., where values don't match, the result list would return a list containing values like: Cell: D75 Source value: 1 <> Migrated value: 2, etc.

The problem is to loop through each cell on a dataset this size takes a really long time. I looked at using Access, and it has a 255 column limit that I prefer not to work around. I'm not familiar with PowerQuery enough to do this in a way that will return the clear result. I think this can be done using the scripting dictionary in a much faster way than looping through cells, but I can't get the syntax right.

The following code is sort of doing what I want it to do, but it's currently returning the complete record from each dataset that doesn't match, rather than only the specific value. Rather than returning two records for the mismatch, I'd prefer to have one record for each specific difference e.g., Cell: D75 Source value: 1 <> Migrated value: 2, Cell: JK 525,431 Source value: A <> Migrated value: B, etc.. To do this, I think the code may need to be changed to nest the second dictionary within the first, to then iterate through the items of each dictionary, compare and return the mismatch.

I'm new to using the scripting dictionary and would like to learn this - any guidance on this would be greatly appreciated. Thanks!


VBA Code:
Sub CompareData()

Dim vstr As String
Dim vData As Variant
Dim vitm As Variant
Dim vArr As Variant
Dim v()
Dim currentSht As String
Dim shtResults As String

Dim a As Long
Dim b As Long
Dim c As Long

Application.ScreenUpdating = False

shtResults = "Results"

'If Sheet doesn't exist, create it
If SheetExists(shtResults) = True Then
    Application.DisplayAlerts = False
    Sheets(shtResults).Delete
    Application.DisplayAlerts = True
    Sheets.Add
    ActiveSheet.Name = shtResults
Else
    Sheets.Add
    ActiveSheet.Name = shtResults
End If

vData = Sheets("Source").Range("A1:M11").Value
currentSht = "Source"

With CreateObject("Scripting.Dictionary")
        .CompareMode = 1
        
        ReDim v(1 To UBound(vData, 2))
        
        For a = 2 To UBound(vData, 1)
            
            For b = 1 To UBound(vData, 2)
                vstr = vstr & Chr(2) & vData(a, b)
                'Debug.Print vstr
                v(b) = "[Cell: " & a & ", " & b & " -- " & currentSht & "] " & vData(a, b)
            Next
            
            .Item(vstr) = v
            vstr = ""
            
        Next
        
        vData = Sheets("Migrated").Range("A1:M11").Value
        currentSht = "Migrated"

        For a = 2 To UBound(vData, 1)
            
            For b = 1 To UBound(vData, 2)
                vstr = vstr & Chr(2) & vData(a, b)
                v(b) = "[Cell: " & a & ", " & b & " -- " & currentSht & "] " & vData(a, b)
            Next
        
            If .Exists(vstr) Then
                .Item(vstr) = Empty
                Else
                .Item(vstr) = v
            End If
            
            vstr = ""
        Next
        
        For Each vitm In .Keys
            If IsEmpty(.Item(vitm)) Then
            .Remove vitm
            End If
        Next
            
        vArr = .Items
        c = .Count

End With

With Sheets(shtResults).Range("A1").Resize(, UBound(vData, 2))

    .Cells.Clear
    .Value = vData
    
    If c > 0 Then
        .Offset(1).Resize(c).Value = Application.Transpose(Application.Transpose(vArr))
        
    End If
    
End With

Application.ScreenUpdating = True

End Sub
 
Great, thank you again - this was a relief to sort out and I was having a hard time solving for this - you made it look easy :).
 
Upvote 0

Excel Facts

What do {} around a formula in the formula bar mean?
{Formula} means the formula was entered using Ctrl+Shift+Enter signifying an old-style array formula.

Forum statistics

Threads
1,214,387
Messages
6,119,225
Members
448,877
Latest member
gb24

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top