VBA: Calling Scripting.Dictionary Values Using a Column as a Key

thorspear

New Member
Joined
Jan 29, 2013
Messages
3
Hello all,

I have a Master list of 230k IDs that correspond to different values depending on the raw data source. I have 30 Source sheets with ~140k IDs each, the sort and number of which are unique to each sheet. The Source sheets contain 2 columns, an ID column and a value ("Bucket") column (excuse all the tables but I understand it might be easier with a visual):

Master
[TABLE="width: 100"]
<tbody>[TR]
[TD]ID[/TD]
[TD]Source1[/TD]
[TD]Source2[/TD]
[TD]Source3[/TD]
[/TR]
[TR]
[TD]1[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]2[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]3[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]4[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]5[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]6[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]7[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]8[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]9[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]10[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
</tbody>[/TABLE]

Source1
[TABLE="width: 100"]
<tbody>[TR]
[TD]ID[/TD]
[TD]Bucket[/TD]
[/TR]
[TR]
[TD]1[/TD]
[TD]A[/TD]
[/TR]
[TR]
[TD]5[/TD]
[TD]B[/TD]
[/TR]
[TR]
[TD]7[/TD]
[TD]A[/TD]
[/TR]
[TR]
[TD]8[/TD]
[TD]C[/TD]
[/TR]
</tbody>[/TABLE]

Source2
[TABLE="width: 100"]
<tbody>[TR]
[TD]ID[/TD]
[TD]Bucket[/TD]
[/TR]
[TR]
[TD]2[/TD]
[TD]B[/TD]
[/TR]
[TR]
[TD]3[/TD]
[TD]B[/TD]
[/TR]
[TR]
[TD]7[/TD]
[TD]C[/TD]
[/TR]
[TR]
[TD]9[/TD]
[TD]D[/TD]
[/TR]
[TR]
[TD]10[/TD]
[TD]A[/TD]
[/TR]
</tbody>[/TABLE]

Source3
[TABLE="width: 100"]
<tbody>[TR]
[TD]ID[/TD]
[TD]Bucket[/TD]
[/TR]
[TR]
[TD]2[/TD]
[TD]C[/TD]
[/TR]
[TR]
[TD]4[/TD]
[TD]D[/TD]
[/TR]
[TR]
[TD]6[/TD]
[TD]A[/TD]
[/TR]
[TR]
[TD]9[/TD]
[TD]C[/TD]
[/TR]
</tbody>[/TABLE]

Ordinarily, one could do a VLOOKUP on the ID column in the master sheet and return the value in source sheet, but due to the size of these files, VLOOKUPs are no longer an option. I would like to create a scripting dictionary for each source which then uses the ID column in the master to populate the Bucket value. The outcome would look something like this:

Master
[TABLE="width: 100"]
<tbody>[TR]
[TD]ID[/TD]
[TD]Source1[/TD]
[TD]Source2[/TD]
[TD]Source3[/TD]
[/TR]
[TR]
[TD]1[/TD]
[TD]A[/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]2[/TD]
[TD][/TD]
[TD]B[/TD]
[TD]C[/TD]
[/TR]
[TR]
[TD]3[/TD]
[TD][/TD]
[TD]B[/TD]
[TD][/TD]
[/TR]
[TR]
[TD]4[/TD]
[TD][/TD]
[TD][/TD]
[TD]D[/TD]
[/TR]
[TR]
[TD]5[/TD]
[TD]B[/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]6[/TD]
[TD][/TD]
[TD][/TD]
[TD]A[/TD]
[/TR]
[TR]
[TD]7[/TD]
[TD]A[/TD]
[TD]C[/TD]
[TD][/TD]
[/TR]
[TR]
[TD]8[/TD]
[TD]C[/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]9[/TD]
[TD][/TD]
[TD]D[/TD]
[TD]C[/TD]
[/TR]
[TR]
[TD]10[/TD]
[TD][/TD]
[TD]A[/TD]
[TD][/TD]
[/TR]
</tbody>[/TABLE]

Here is the code that I have so far. Credit goes to the excellent walk through by matthewspatrick at Experts Exchange which can be found here:

Code:
Sub Rosetta()


    Dim Contents As Variant
    Dim r As Long
    Dim dic As Object
    Dim ID As String
    Dim bucket As String
    Dim Keys As Variant
    
    With Application
        .Calculation = xlCalculationManual
        .ScreenUpdating = False
    End With
    
    With ThisWorkbook.Worksheets("Buckets")
        
        .Range("c1").Resize(1, .Columns.Count - 2).EntireColumn.Delete
        Contents = .Range("a2", .Cells(.Rows.Count, "B").End(xlUp)).Value
        
        Set dic = CreateObject("Scripting.Dictionary")
        For r = 1 To UBound(Contents, 1)
            ID = Contents(r, 1)
            bucket = Contents(r, 2)
            
            If dic.Exists(ID) Then


                If bucket < dic.Item(ID) Then dic.Item(ID) = bucket
            Else
                
                dic.Add ID, bucket
            End If
        Next


    End With
    
    With Application
        .Calculation = xlCalculationAutomatic
        .ScreenUpdating = True
    End With
    
    MsgBox "Done"
    
End Sub

If I've done this correctly, this should create one of the 30 dictionaries. What I am stuck on is how to then call the values from this dictionary on the Master sheet. Ideally the above "create" code would be in each Source sheet, and the "call" code in the Master file would run all 30 dictionaries one after another.

I am also open to other ideas of how to solve this problem if I've tried to create too elegant a solution and made more work for myself.

Thank you for your help!
Kirsten
 

Excel Facts

Lock one reference in a formula
Need 1 part of a formula to always point to the same range? use $ signs: $V$2:$Z$99 will always point to V2:Z99, even after copying
I can understand the desire not to use VLookup for 6.9M formulas. But read on...

The dictionaries only exist for the duration of the macro. They are not kept with the sheet and automatically updated if the sheet changes.

Is the master sheet supposed to automatically update if any of the other sheets are changed? The way that you are using the dictionary will only keep the smallest value for each ID.

Does the master sheet had the list of IDs in Column A and the names of the other sheets across the top (B1-Z1)?
 
Upvote 0
I can understand the desire not to use VLookup for 6.9M formulas. But read on...

The dictionaries only exist for the duration of the macro. They are not kept with the sheet and automatically updated if the sheet changes.

Is the master sheet supposed to automatically update if any of the other sheets are changed? The way that you are using the dictionary will only keep the smallest value for each ID.

Does the master sheet had the list of IDs in Column A and the names of the other sheets across the top (B1-Z1)?

No, the master sheet does not need to update automatically, great call out.

It does only keep the smallest value, but each list (both Source and Master) is unique, so that case should hopefully never arise.

Yes, the Master has IDs in A, and the Source sheet names across the top.
 
Upvote 0
Hello all,

I have a Master list of 230k IDs that correspond to different values depending on the raw data source. I have 30 Source sheets with ~140k IDs each, the sort and number of which are unique to each sheet. The Source sheets contain 2 columns, an ID column and a value ("Bucket") column (excuse all the tables but I understand it might be easier with a visual):
...
...

If I've done this correctly, this should create one of the 30 dictionaries. What I am stuck on is how to then call the values from this dictionary on the Master sheet. Ideally the above "create" code would be in each Source sheet, and the "call" code in the Master file would run all 30 dictionaries one after another.

I am also open to other ideas of how to solve this problem if I've tried to create too elegant a solution and made more work for myself.

Thank you for your help!
Kirsten
Are your ID's really numbers, specifically integers? Or are they actually some kind of ID code, like Q165-x or something on such lines? With the data sizes you indicate, if the ID's are really integers, the problem you outline can be done much faster by another way.

If you really want to use the scripting dictionary approach, you can have 30 by all means if you like. But from the data as you present it, only one would be needed.
 
Upvote 0
Are your ID's really numbers, specifically integers? Or are they actually some kind of ID code, like Q165-x or something on such lines? With the data sizes you indicate, if the ID's are really integers, the problem you outline can be done much faster by another way.

If you really want to use the scripting dictionary approach, you can have 30 by all means if you like. But from the data as you present it, only one would be needed.

Yes, the IDs are integers of varying length. I would be happy to hear of a faster approach to this.

And you've piqued my curiosity - how would I create one dictionary with 30 values for one key? Or would I concatenate the key with, say, a source ID?
 
Upvote 0
Yes, the IDs are integers of varying length. I would be happy to hear of a faster approach to this.

And you've piqued my curiosity - how would I create one dictionary with 30 values for one key? Or would I concatenate the key with, say, a source ID?
Integers roughly from 1 to 1 million, or ...?

You wouldn't create one dictionary with 30 values for one key, at least not without a rather purposeless jack-up.

From your data in post#1 it seems the only unique values you require are the ID's in column 1 of the mastersheet, and you only need one dictionary for this. The correct locations of the A, B, C etc are fairly easily determined without additional dictionaries.

These comments rely on the data as you presented it in post#1, which is all the knowledge I have of your data.
 
Upvote 0
I was working on a bit of code that should do what you want to do but I got side tracked with my real work.

Make the dictionary key "Sheet Name"|ID. read through all the sheets except the master and build the dictionary with that as the key. Then you can work down the master sheet and build the key for each column row and fill in the value from the dictionary.
 
Upvote 0
Integers roughly from 1 to 1 million, or ...?

You wouldn't create one dictionary with 30 values for one key, at least not without a rather purposeless jack-up.

From your data in post#1 it seems the only unique values you require are the ID's in column 1 of the mastersheet, and you only need one dictionary for this. The correct locations of the A, B, C etc are fairly easily determined without additional dictionaries.

These comments rely on the data as you presented it in post#1, which is all the knowledge I have of your data.


That isn't quite true. The final master sheet may have more than 1 column filled for any ID. The key needs to be a join of the Sheet Name it came from and the ID.
 
Upvote 0
take a look at this code.
Code:
Sub rosetta()
Dim theSheetName As String
Dim theID As String
Dim theValue As String
Dim theKey As String




Dim currRow As Long
Dim lastRow As Long
Dim currCol As Integer
Dim lastCol As Integer


Dim sh As Worksheet


Dim dic As New Dictionary


'  Not really needed since it was just created.  But I wanted to point out the call
dic.RemoveAll


With Application
    .Calculation = xlCalculationManual
    .ScreenUpdating = False
End With
    




'build the dictionary
For Each sh In ActiveWorkbook.Sheets
    theSheetName = sh.Name
    if theSheetName <> "Master" then
        lastRow = sh.UsedRange.Rows.Count
        For currRow = 2 To lastRow
             theID = sh.Cells(currRow, 1)
             theValue = sh.Cells(currRow, 2)
             theKey = theSheetName & "|" & theID
             If dic.Exists(theKey) Then
                 If theValue < dic.Item(theKey) Then dic.Item(theKey) = theValue
             Else
                 dic.Add theKey, theValue
             End If
        End If
    Next
Next


Set sh = Sheets("Master")


lastRow = sh.UsedRange.Rows.Count
lastCol = sh.UsedRange.Columns.Count


For currRow = 2 To lastRow
    For currCol = 2 To lastCol
        theID = sh.Cells(currRow, 1)
        theValue = sh.Cells(currRow, 2)
        theKey = sh.Cells(1, currCol) & "|" & theID
        If dic.Exists(theKey) Then
            sh.Cells(currRow, currCol) = dic.Item(theKey)
        End If
    Next
Next


With Application
    .Calculation = xlCalculationAutomatic
    .ScreenUpdating = True
End With


End Sub


One thing I did was to tell VBA about the dictionary. on the menu go to tools->references In the "Available References" box find "Microsoft Scripting Runtime" and check it. This will allow the editor to know about the objects and methods. It will be well down in the alphabetic list but once you have used it once it keeps it near the top for future projects.
 
Upvote 0
Kirsten,

For your data as given in post#1, try this macro and see if it gives the results you want (apart from a final sorting on ID's, which is trivial to do).

It supposes 3 source sheets, Source1, Source2 and Source3, with your data as given, and one initially blank mastersheet, Master.

Easily extended to as many source sheets as you like.
Code:
Sub champollion()
Const nsources As Long = 3
Dim d As Object, a, c()
Dim rws As Long, i As Long, k As Long

Set d = CreateObject("scripting.dictionary")
ReDim c(1 To 200000, 1 To nsources + 1)

For j = 1 To nsources
With Sheets("Source" & j)
   rws = .Cells(Rows.Count, 1).End(3).Row
    a = .Cells(1).Resize(rws, 2)
End With

For i = 2 To rws
    If Not d.Exists(a(i, 1)) Then
        k = k + 1
        d(a(i, 1)) = k
        c(k, 1) = a(i, 1)
        c(k, j + 1) = a(i, 2)
    Else
        c(d(a(i, 1)), j + 1) = a(i, 2)
    End If
Next i

Next j
Sheets("Master").Cells(2, 1).Resize(k, nsources + 1) = c

End Sub
 
Last edited:
Upvote 0

Forum statistics

Threads
1,223,099
Messages
6,170,107
Members
452,302
Latest member
TaMere

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top