Retrieving and matching data from a VBA array

dgms89

New Member
Joined
Jan 3, 2017
Messages
10
Greetings,

I am trying to match the data in multiple files using arrays (because vlookup takes too long).
For example: I have 100 source files (workbooks) each having the information for 1 day. Each file has a number of items and each item have multiple information columns, as of a normal database.
<code style="margin: 0px; padding: 0px; border: 0px; font-family: Consolas, Menlo, Monaco, "Lucida Console", "Liberation Mono", "DejaVu Sans Mono", "Bitstream Vera Sans Mono", "Courier New", monospace, sans-serif; white-space: inherit;">Workbook 1 - 20 of January
Item
(identifier) Price Price2 Price 3
A
100 5 10
B
80 6 15
C
110 3 15
Workbook
2 - 21 of January
Item
(identifier) Price Price2 Price 3
A
95 3 5
B
90 6 10
C
120 4 20
D
130 7 2

Workbook
3 - 22 of January
Item
(identifier) Price Price2 Price 3
A
70 11 35
B
100 1 15
D
45 17 25</code>There are around 20 price columns, and thousands of items in each day (workbook). New identifiers might be added or removed in new days. My code gets this data and uses vlookup to match, as it adds new source files (if a file does not have an item that was there in the past, the information simply counts as zero).
My output in excel is:
<code style="margin: 0px; padding: 0px; border: 0px; font-family: Consolas, Menlo, Monaco, "Lucida Console", "Liberation Mono", "DejaVu Sans Mono", "Bitstream Vera Sans Mono", "Courier New", monospace, sans-serif; white-space: inherit;">Tab (Worksheet) 1 (price)
All the item Identifiers
20/01 21/01 22/01
A
100 95 70
B
80 90 100
C
3 4 -
D
- 130 45

Tab (Worksheet)
2 (price2)
All the item Identifiers
20/01 21/01 22/01
A
5 3 11
B
6 6 1
C
110 120 -
D
- 2 25</code>And so on.
First problem is how to compare the identifiers in each file, and add the new ones to the end of the output list (each time a new day workbook is included).
Second is that, just doing the current procedure, the code takes too long (more than 20 hours). If I transform this information (each file separately) to an array, how can I search the information inside the array, match it and past it to the output file? Is there a faster option than vlookup?

If not, is it possible to use it to get multiple results are once? (instead of running 1 lookup for each column of each identifier, run i lookup for each identifier only and return all the columns at once, since they are always the same)

Any help will be deeply appreciated.
 

Excel Facts

When did Power Query debut in Excel?
Although it was an add-in in Excel 2010 & Excel 2013, Power Query became a part of Excel in 2016, in Data, Get & Transform Data.
Since I did not find how to edit the original post, I fixed some information here:
Input:
Workbook 1 (20 of January)
Item (Identifier) Price1 Price2 Price3
A 100 5 10
B 80 6 15
C 110 3 15

Workbook 2 (21 of January)
Item (Identifier) Price1 Price2 Price3
A 95 3 5
B 90 6 10
C 120 4 20
D 130 7 20

Workbook 3 (22 of January)
Item (Identifier) Price1 Price2 Price3
A 70 11 35
B 100 11 5
D 45 17 25

<code style="margin: 0px; padding: 0px; font-family: Consolas, Menlo, Monaco, "Lucida Console", "Liberation Mono", "DejaVu Sans Mono", "Bitstream Vera Sans Mono", "Courier New", monospace, sans-serif; line-height: 13px; color: rgb(51, 51, 51); white-space: inherit; background-color: rgb(250, 250, 250); border: 0px;">Output:
Tab 1 (worksheet) - Price:
All Items (Identifiers) 20/01 21/01 22/01:
A 100 95 70
B 80 90 100
C 110 120 -
D - 130 45

Tab 2 (worksheet) - Price2:
All Items (Identifiers) 20/01 21/01 22/01:
A 5 3 11
B 6 6 11
C 3 4 -
D - 7 17
</code>
 
Upvote 0
Hi,
Basically you need to collect all identificators from all files (A,B,C,D...) and calculate each Price for each day.
Why don't you put all data in one file and then operate with whole data base? Date could be inputed in seperate column like:
Date Item Price Price1 ...
20.01.2016 A 100 50
20.01.2016 B 20 60
21.01.2016 A 120 70
...

This format will allow you to load all data in array and process them really faster than 20h.
 
Upvote 0
Hi,
Basically you need to collect all identificators from all files (A,B,C,D...) and calculate each Price for each day.
Why don't you put all data in one file and then operate with whole data base? Date could be inputed in seperate column like:
Date Item Price Price1 ...
20.01.2016 A 100 50
20.01.2016 B 20 60
21.01.2016 A 120 70
...

This format will allow you to load all data in array and process them really faster than 20h.


Thanks for the Answer.
I first tried to do that, but still, copying all the information to one single file requires too much time.
Apparently that gives me two options: Or I gather all the files into one array and work on that, or I get file by file into one array, and use the identifiers of each file to create my Identifier Array (which increase every time I add a new file/workbook/day).

For the first case, I would have to many redundancies, which I believe would just increase the run time.
That is why I was aiming for the second option, but I am having trouble manipulating both arrays and matching the information between them to output to the worksheet.
 
Upvote 0
Can you show how you are trying to collect all data to array? Do you need to do this every day?
Copy all info from all files every day - requires much time, but what if you do this exercise now and every day just add info for one day? This will allow you to keep data for each day in separete files and you will have common data base to work with.
Other option could be SQL query from all files, this might be faster way to get all data in one sheet.
 
Upvote 0
Can you show how you are trying to collect all data to array? Do you need to do this every day?
Copy all info from all files every day - requires much time, but what if you do this exercise now and every day just add info for one day? This will allow you to keep data for each day in separete files and you will have common data base to work with.
Other option could be SQL query from all files, this might be faster way to get all data in one sheet.

I have to redo the whole thing every day, as new source files come out (and the identifiers can change).
my original code does everything with vlookup (but it takes to long, because there are many items and many source files):

<code style="margin: 0px; padding: 0px; border: 0px; font-family: Consolas, Menlo, Monaco, "Lucida Console", "Liberation Mono", "DejaVu Sans Mono", "Bitstream Vera Sans Mono", "Courier New", monospace, sans-serif; white-space: inherit;">Sub Info()
Dim w As Workbook
Set w = ThisWorkbook
Dim w2 As Workbook
Dim d As Date
Dim Yesterday As Date
Dim a As Long, Z As Long, Y As Long, c As Long, end As Long, end2 As Long, i As Long, b As Long

a
= 4
Z
= 7
Y
= 2
c
= 0

end = ThisWorkbook.Worksheets(1).UsedRange.Rows.count

Dim MyFolder As String
Dim MyFile As String

'Optimize Macro Speed Start
Application
.ScreenUpdating = False
Application
.EnableEvents = False
Application
.Calculation = xlCalculationManual

For i = 2 To ThisWorkbook.Sheets("FILES").Cells(1, 2).Value

Workbooks
.Open Filename:=ThisWorkbook.path & "" & ThisWorkbook.Sheets("FILES").Cells(i, 1).Value

Set w2 = ActiveWorkbook
ActiveSheet
.Range("A:A").Select

'source file is all in one column, this fix it, transforming text to columns

Selection
.TextToColumns destination:=Range("A1"), DataType:=xlDelimited, _
TextQualifier
:=xlDoubleQuote, ConsecutiveDelimiter:=False, Tab:=True, _
Semicolon
:=True, Comma:=True, Space:=False, Other:=False, FieldInfo:= _
Array
(Array(1, 1), Array(2, 1), Array(3, 1), Array(4, 1), Array(5, 1), Array(6, 1), Array(7 _
, 1), Array(8, 1), Array(9, 1), Array(10, 1), Array(11, 1), Array(12, 1), Array(13, 1), Array(14, 1), Array(15, 1), Array(16, 1), Array(17 _
, 1), Array(18, 1), Array(19, 1), Array(20, 1), Array(21, 1), Array(22, 1), Array(23, 1), Array(24, 1), Array(25, 1), Array(26, 1), Array(27 _
, 1), Array(28, 1), Array(29, 1), Array(30, 1), Array(31, 1), Array(32, 1)), TrailingMinusNumbers:=True

end2
= ActiveSheet.UsedRange.Rows.count

Do While Z <= end2
Do While Y <= end And c = 0
If w2.Worksheets(1).Cells(Z, 1) = w.Worksheets(1).Cells(Y, 1) Then
c
= 1
End If
Y
= Y + 1
Loop
If c = 0 Then
w
.Worksheets(1).Cells(end + 1, 1) = w2.Worksheets(1).Cells(Z, 1)
end = end + 1
End If
Y
= 2
c
= 0
Z
= Z + 1
Loop
Z
= 7

end = ThisWorkbook.Worksheets(1).UsedRange.Rows.count

b
= 2
w
.Worksheets(1).Cells(1, a) = w2.Worksheets(1).Cells(1, 2)
While b <= end
'Vlookup the identifiers within each file and retrieve the desired data based on the specified column
'data1
w
.Worksheets(1).Cells(b, a) = Application.IfError(Application.VLookup(w.Worksheets(1).Cells(b, 1).Value, w2.Sheets(1).Range("A:AA"), 12, 0), 0)
'data2
w
.Worksheets(2).Cells(b, a) = Application.IfError(Application.VLookup(w.Worksheets(1).Cells(b, 1).Value, w2.Sheets(1).Range("A:AA"), 11, 0), 0)
'data3
w
.Worksheets(3).Cells(b, a) = Application.IfError(Application.VLookup(w.Worksheets(1).Cells(b, 1).Value, w2.Sheets(1).Range("A:AA"), 27, 0), 0)
'data4
w
.Worksheets(4).Cells(b, a) = Application.IfError(Application.VLookup(w.Worksheets(1).Cells(b, 1).Value, w2.Sheets(1).Range("A:AA"), 16, 0), 0)
'data5
w
.Worksheets(5).Cells(b, a) = Application.IfError(Application.VLookup(w.Worksheets(1).Cells(b, 1).Value, w2.Sheets(1).Range("A:AA"), 20, 0), 0)
'data6
w
.Worksheets(6).Cells(b, a) = Application.IfError(Application.VLookup(w.Worksheets(1).Cells(b, 1).Value, w2.Sheets(1).Range("A:AA"), 22, 0), 0)
'data7
w
.Worksheets(7).Cells(b, a) = Application.IfError(Application.VLookup(w.Worksheets(1).Cells(b, 1).Value, w2.Sheets(1).Range("A:AA"), 9, 0), 0)
'data8
w
.Worksheets(8).Cells(b, a) = Application.IfError(Application.VLookup(w.Worksheets(1).Cells(b, 1).Value, w2.Sheets(1).Range("A:AA"), 10, 0), 0)
'data9
w
.Worksheets(9).Cells(b, a) = Application.IfError(Application.VLookup(w.Worksheets(1).Cells(b, 1).Value, w2.Sheets(1).Range("A:AA"), 17, 0), 0)
'data10
w
.Worksheets(10).Cells(b, a) = Application.IfError(Application.VLookup(w.Worksheets(1).Cells(b, 1).Value, w2.Sheets(1).Range("A:AA"), 18, 0), 0)
'data11
w
.Worksheets(11).Cells(b, a) = Application.IfError(Application.VLookup(w.Worksheets(1).Cells(b, 1).Value, w2.Sheets(1).Range("A:AA"), 19, 0), 0)
'data12
w
.Worksheets(12).Cells(b, a) = Application.IfError(Application.VLookup(w.Worksheets(1).Cells(b, 1).Value, w2.Sheets(1).Range("A:AA"), 21, 0), 0)
'data13
w
.Worksheets(13).Cells(b, a) = Application.IfError(Application.VLookup(w.Worksheets(1).Cells(b, 1).Value, w2.Sheets(1).Range("A:AA"), 24, 0), 0)

b
= b + 1
Wend
w2
.Close False
a
= a + 1
'MyFile = Dir

Next
'Optimize Macro Speed
Application
.ScreenUpdating = True
Application
.EnableEvents = True
Application
.Calculation = xlCalculationAutomatic

'Close file and save
'w.Close True

End Sub</code>
 
Upvote 0
Your code is so slow because of too many references to cells on a sheet.
I'll try to present my idea how to process data.
I would suggest to devide process into 2 parts: first is data collection and second is data processing.
I've created simplified version of file to process data, to get this sample work:
1) download zip file https://drive.google.com/open?id=0B5PLkgMtqYCUOUJzb1BNbFRlaVE
2) place folder "abc" in disk C: (path should be C:\abc\)
3) Open CombineData.xlsm
4) Press Collect data
5) Press Reports

1) Data collection
Code below will collect data from all files in folder C:\abc\ (marked red in code). You can change this to your folder.
Code will go throw all files and copy range A2:Z10000 from Sheet1, if name of sheets or range are different - this should be changed for real data.
Main point of this - to get data from all files and put them into one file.
In attached file button CollectData is binded to this code.
Code:
Sub CollateFromFiles()
Dim strFileName As String, strPath As String, MyVal As String
Dim wbkOld As Workbook, wbkNew As Workbook, ws As Worksheet
Dim NR As Long, LR, j As Long
Application.EnableEvents = False
Application.DisplayAlerts = False
Set wbkNew = ThisWorkbook
strPath = "[COLOR=#ff0000]C:\abc\[/COLOR]" 'folder with all files
strFileName = Dir(strPath & "*.xlsx")
wbkNew.Activate
NR = 1
'Collate data from each file in the designated folder
    Do While Len(strFileName) > 0
        Set wbkOld = Workbooks.Open(strPath & strFileName) 'open file from folder
        LR = ActiveSheet.Range("A" & Rows.Count).End(xlUp).Row  'calculates last row in sheet Report
        wbkOld.[COLOR=#ff0000]Worksheets("Sheet1").Range("A2:Z10000")[/COLOR].Copy 'copy data from just opened file / ranges mught be adjusted
        wbkNew.Sheets("Report").Range("B" & NR).PasteSpecial xlValues
        
        'insert name of current source file
        For j = NR To wbkNew.Sheets("Report").Range("B" & Rows.Count).End(xlUp).Row
            wbkNew.Sheets("Report").Range("A" & j) = wbkOld.Name
        Next
        
        NR = NR + LR - 1
        strFileName = Dir
        wbkOld.Close False
         
    Loop
'Cleanup
   Application.DisplayAlerts = True
    Application.EnableEvents = True
End Sub

2) Data processing
After all data is collected into single sheet - we can load them ingo VBA array and make all manipulations with array (not cells).
Code below will loop throw each Price, combine data and print them out in new sheet.
Code:
Sub Reports()
Dim arr
Dim res() As Variant
Dim dID, dDate
Dim PriceCount, LR, i, j, k, z As Long
Dim ws As Worksheet
'calculate last row & last column
LR = Sheets("Report").Range("A" & Rows.Count).End(xlUp).Row
PriceCount = Sheets("Report").Range("A1").CurrentRegion.Columns.Count
'place data in VBA array
arr = Sheets("Report").Range(Cells(1, 1), Cells(LR, PriceCount))
'create list of unique identifiers & list of unique dates
Set dID = CreateObject("Scripting.Dictionary")
Set dDate = CreateObject("Scripting.Dictionary")
j = 1
With dID
    For i = 1 To UBound(arr) 'loop throw all array rows and collect unique indentifiers ot dictionary dID
        If Not .Exists(arr(i, 2)) Then
            .Add arr(i, 2), j
            j = j + 1
        End If
    Next
End With
With dDate
    For i = 1 To UBound(arr) 'loop throw all array rows and collect unique dates in dictionary dDate
        If Not .Exists(arr(i, 1)) Then .Add arr(i, 1), 0
    Next
End With
'Processing all reports
For i = 3 To PriceCount  'loop for each Price, starting from 3 because column C should contain Price, column 4 contains Price1 ...
  ReDim res(1 To dID.Count + 1, 1 To dDate.Count + 1)  'make array for results of current Price
      For k = 0 To dDate.Count - 1   'loop throw all dates
            res(1, k + 2) = dDate.Keys()(k)   'write date in result array
          For z = 0 To dID.Count - 1  'loop throw all items
             res(z + 2, 1) = dID.Keys()(z) 'write item in result array
                For j = 1 To UBound(arr)  'loop throw all data
                   If arr(j, 1) = dDate.Keys()(k) And arr(j, 2) = dID.Keys()(z) Then  'if in full array of data we found row with Date and Item then write it ot result array
                        res(z + 2, k + 2) = res(z + 2, k + 2) + arr(j, i) 'this sum all recods for same Date and Item (if there are many A items in 20.01.2016 - they all will be sumarized)
                   End If
                Next
          Next
        Next
'add sheet and paste results
    
    Set ws = Sheets.Add(After:=Sheets(Worksheets.Count)) 'create new sheet
    ws.Name = i  'rename sheet
       
    ws.Range("A1").Resize(dID.Count + 1, dDate.Count + 1) = res  'paste result array to a just created sheet
    Erase res 'delete result sheet
Next  ' take next Price
End Sub
 
Upvote 0
I managed to build this code, with a stated IsInArray function (Which is currently giving me a Subscript out of Range error)

Code:
Sub Price()
    Dim w As Workbook
    Set w = ThisWorkbook
    Dim w2 As Workbook
    Dim end1 As Long, end2 As Long, i As Long, lRow As Long, lColumn As Long, t As Long, k As Long, position As Long, g As Long
    Dim WBArray() As Variant
    Dim IS() As Variant
    Dim ws As Worksheet
    
    end1 = ThisWorkbook.Worksheets(1).UsedRange.Rows.count


    Dim MyFolder As String
    Dim MyFile As String


'Optimize Macro Speed Start
    Application.ScreenUpdating = False
    Application.EnableEvents = False
    Application.Calculation = xlCalculationManual
    
'opens the first workbook file
    For i = 2 To ThisWorkbook.Sheets("FILES").Cells(1, 2).Value


        Workbooks.Open Filename:=ThisWorkbook.path & "\" & ThisWorkbook.Sheets("FILES").Cells(i, 1).Value


        Set w2 = ActiveWorkbook
        ActiveSheet.Range("A:A").Select
        
        'text to columns
        Selection.TextToColumns destination:=Range("A1"), DataType:=xlDelimited, TextQualifier:=xlDoubleQuote, ConsecutiveDelimiter:=False, Tab:=True, _
            Semicolon:=True, Comma:=True, Space:=False, Other:=False, FieldInfo:=Array(Array(1, 1), Array(2, 1), Array(3, 1), Array(4, 1), Array(5, 1), Array(6, 1), Array(7 _
            , 1), Array(8, 1), Array(9, 1), Array(10, 1), Array(11, 1), Array(12, 1), Array(13, 1), Array(14, 1), Array(15, 1), Array(16, 1), Array(17 _
            , 1), Array(18, 1), Array(19, 1), Array(20, 1), Array(21, 1), Array(22, 1), Array(23, 1), Array(24, 1), Array(25, 1), Array(26, 1), Array(27 _
            , 1), Array(28, 1), Array(29, 1), Array(30, 1), Array(31, 1), Array(32, 1)), TrailingMinusNumbers:=True


        end2 = ActiveSheet.UsedRange.Rows.count


'transform it to array
        WBArray = ActiveSheet.Range(Cells(5, 1), Cells(end2, 29)).Value


'loop to match information in two arrays


        For lRow = 2 To UBound(WBArray)
            If IsInArray((WBArray(lRow, 1)), IS) <> -1 Then
                t = IsInArray((WBArray(lRow, 1)), IS)
             
'start the information pasting procedure:
                w.Sheets("Cpn").Cell(t, i + 3) = WBArray(lRow, 11)
                w.Sheets("Mat").Cell(t, i + 3) = WBArray(lRow, 12)
                w.Sheets("Weight t-1").Cell(t, i + 3) = WBArray(lRow, 13)
                w.Sheets("Price").Cell(t, i + 3) = WBArray(lRow, 14)
                w.Sheets("Acc").Cell(t, i + 3) = WBArray(lRow, 15)
                w.Sheets("PCash").Cell(t, i + 3) = WBArray(lRow, 16)
                w.Sheets("AMNT").Cell(t, i + 3) = WBArray(lRow, 17)
                w.Sheets("AMNT t-1").Cell(t, i + 3) = WBArray(lRow, 18)
                w.Sheets("Price t-1").Cell(t, i + 3) = WBArray(lRow, 19)
                w.Sheets("FX").Cell(t, i + 3) = WBArray(lRow, 20)
                w.Sheets("FX t-1").Cell(t, i + 3) = WBArray(lRow, 21)
                w.Sheets("Acc t-1").Cell(t, i + 3) = WBArray(lRow, 22)
                w.Sheets("SINK").Cell(t, i + 3) = WBArray(lRow, 23)


            Else
            
'add it to the end of ISArray
                ReDim Preserve IS(1 To UBound(IS) + 1)
                IS(UBound(IS)) = WBArray(lRow, 1)
                k = UBound(IS)
                
                w.Sheets("Cpn").Cell(k, i + 3) = WBArray(lRow, 11)
                w.Sheets("Mat").Cell(k, i + 3) = WBArray(lRow, 12)
                w.Sheets("Weight t-1").Cell(k, i + 3) = WBArray(lRow, 13)
                w.Sheets("Price").Cell(k, i + 3) = WBArray(lRow, 14)
                w.Sheets("Acc").Cell(k, i + 3) = WBArray(lRow, 15)
                w.Sheets("PCash").Cell(k, i + 3) = WBArray(lRow, 16)
                w.Sheets("AMNT").Cell(k, i + 3) = WBArray(lRow, 17)
                w.Sheets("AMNT t-1").Cell(k, i + 3) = WBArray(lRow, 18)
                w.Sheets("Price t-1").Cell(k, i + 3) = WBArray(lRow, 19)
                w.Sheets("FX").Cell(k, i + 3) = WBArray(lRow, 20)
                w.Sheets("FX t-1").Cell(k, i + 3) = WBArray(lRow, 21)
                w.Sheets("Acc t-1").Cell(k, i + 3) = WBArray(lRow, 22)
                w.Sheets("SINK").Cell(k, i + 3) = WBArray(lRow, 23)


            End If
      
      Next lRow
'copy the file date from each source workbook to output workbook
'if the control sheet name (FILES) is changed, please change it in this loop
            
            For Each ws In w.Worksheets
                
                If ws.Name <> "FILES" Then
                    
                    ws.Cells(1, i + 3) = w2.Worksheets(1).Cells(1, 2)
        
                End If
                
            Next ws
   
    Next i


'paste the isin array to all worksheets
   
        g = UBound(IS)
        For Each ws In ActiveWorkbook.Worksheets
         
            Range("A1:A" & g) = IS()
            
        Next ws


'Optimize Macro Speed
    Application.ScreenUpdating = True
    Application.EnableEvents = True
    Application.Calculation = xlCalculationAutomatic


'Close file and save
    'w.Close True




End Sub




     
Function IsInArray(stringToBeFound As String, arr As Variant) As Long
    Dim position As Long
    'default return value if value not found in array
    IsInArray = -1


    For position = LBound(arr) To UBound(arr) 'subscript out of range
        If arr(position) = stringToBeFound Then
            IsInArray = position + 1
            Exit For
        End If
    Next
    
End Function
 
Upvote 0

Forum statistics

Threads
1,214,646
Messages
6,120,716
Members
448,985
Latest member
chocbudda

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top