Performance: What is the best "method" to pull data from Several WorkBooks using VBA

Luthius

Active Member
Joined
Apr 5, 2011
Messages
274
Guys
I know that there is more than one way to skin a cat, but what is the best way to pull down data from more than 01 workbook taking into consideration the performance of the code?


  1. Pull data by a recordset?
  2. Pull data by Copying range
  3. Another Method/Approach?

The codes below I developed as a matter of testing its performance and was developed to pull data of 01 Workbook only. I am disappointed with the results in both codes.

The Source Workbook contains less than 900k records along (43587 rows and 22 Columns). In my opinion it is not a huge number of records nowadays.

Does anyone know some better solution that gives a "faster" response?

Code:
[COLOR=#0000ff]Public Const[/COLOR] strDataSource [COLOR=#0000ff]As String[/COLOR] = "C:\Users\luthius\Documents\MyData.xlsx"
    [COLOR=#008000]'By Recordset[/COLOR]
[COLOR=#0000ff]Sub[/COLOR] PullDataByRecordset()
    [COLOR=#008000]'Reference to Microsoft ActiveX Data Objects x.x Library[/COLOR]
    [COLOR=#0000ff]Dim[/COLOR] wbADOCn [COLOR=#0000ff]As New[/COLOR] ADODB.Connection
    [COLOR=#0000ff]Dim[/COLOR] strCn [COLOR=#0000ff]As[/COLOR] String
    [COLOR=#0000ff]Dim[/COLOR] rst [COLOR=#0000ff]As [/COLOR][COLOR=#0000ff]New[/COLOR] ADODB.Recordset
    [COLOR=#0000ff]Dim[/COLOR] strSQL [COLOR=#0000ff]As[/COLOR] String
    [COLOR=#0000ff]Dim [/COLOR]wbDestination [COLOR=#0000ff]As[/COLOR] Workbook


    [COLOR=#0000ff]Set[/COLOR] wbDestination = ActiveWorkbook

    strCn = "Provider=Microsoft.ACE.OLEDB.12.0;" & _
            "Data Source=" & strDataSource & _
            ";Extended Properties=""Excel 12.0 Xml;HDR=Yes"";"
    strSQL = "Select * from [Sheet1$]"

    wbADOCn.Open strCn
    rst.Open strSQL, wbADOCn

    wbDestination.Sheets("Sheet2").Range("A2").CopyFromRecordset rst

    rst.Close
    wbADOCn.Close


    [COLOR=#0000ff]Set [/COLOR]wbADOCn = [COLOR=#0000ff]Nothing[/COLOR]
    [COLOR=#0000ff]Set[/COLOR] rst = [COLOR=#0000ff]Nothing[/COLOR]
[COLOR=#0000ff]End Sub[/COLOR]

    [COLOR=#008000]'By Copying range[/COLOR]
[COLOR=#0000ff]Sub[/COLOR] pullDataByCopyingRange()
    [COLOR=#0000ff]Dim[/COLOR] wbSource [COLOR=#0000ff]As[/COLOR] Workbook
    [COLOR=#0000ff]Dim[/COLOR] rngSource [COLOR=#0000ff]As[/COLOR] Range
    [COLOR=#0000ff]Dim [/COLOR]wbDestination [COLOR=#0000ff]As[/COLOR] Workbook

    [COLOR=#0000ff]Set[/COLOR] wbDestination = ActiveWorkbook
    [COLOR=#0000ff]Set[/COLOR] wbSource = Workbooks.Open(strDataSource)
    [COLOR=#0000ff]Set[/COLOR] rngSource = wbSource.Worksheets("Sheet1").UsedRange

    rngSource.Offset(1).Copy

    wbDestination.Sheets("Sheet2").Range("A2").PasteSpecial Paste:=xlPasteValues

    wbSource.Close [COLOR=#0000ff]False[/COLOR]
    [COLOR=#0000ff]Set[/COLOR] wbSource = [COLOR=#0000ff]Nothing[/COLOR]
    [COLOR=#0000ff]Set[/COLOR] rngSource = [COLOR=#0000ff]Nothing[/COLOR]
   [COLOR=#0000ff] Set[/COLOR] wbDestination = [COLOR=#0000ff]Nothing[/COLOR]
[COLOR=#0000ff]End Sub[/COLOR]

    [COLOR=#008000]' Or by sweeping data using some array or another approach?[/COLOR]
[COLOR=#0000ff]Sub[/COLOR] pullDataByArray()
[COLOR=#008000]'????[/COLOR]
[COLOR=#0000ff]End Sub[/COLOR]
Ps.: The DisplayAlerts/ScreenUpdate/Calculation were not turned off because I wanted to study the behaviour/performance of the code - "Raw code".
 
Last edited:

Some videos you may like

Excel Facts

Round to nearest half hour?
Use =MROUND(A2,"0:30") to round to nearest half hour. Use =CEILING(A2,"0:30") to round to next half hour.

Fluff

MrExcel MVP, Moderator
Joined
Jun 12, 2014
Messages
40,837
Office Version
365
Platform
Windows
How long did those methods take?
What version of Xl are you running?
 

Luthius

Active Member
Joined
Apr 5, 2011
Messages
274
Excel Version: 14 64-bit
Windows: 7 Enterprise
1st Code:18s
2nd Code: 16s

Using Matlab: 0.8s
 

Fluff

MrExcel MVP, Moderator
Joined
Jun 12, 2014
Messages
40,837
Office Version
365
Platform
Windows
The 2nd code only took me 3s for a sheet with 46710 rows & 32 columns

Have you looked into PowerQuery?
 

MARK858

MrExcel MVP
Joined
Nov 12, 2010
Messages
12,679
Office Version
365, 2010
Platform
Windows, Mobile
Using Matlab: 0.8s
Ps.: The DisplayAlerts/ScreenUpdate/Calculation were not turned off because I wanted to study the behaviour/performance of the code - "Raw code".
You are comparing it to Matlab but you don't state how you are importing it with Matlab, do you know that DisplayAlerts/ScreenUpdate/Calculation aren't turned off with the method you are using with Matlab for the comparison?

PS. obviously see Fluff's suggestion with Power Query.
 

Luthius

Active Member
Joined
Apr 5, 2011
Messages
274
Interesting performance.

Well, what about several workbooks (100, 1000, +).
Using the method of Open/Close workbooks causes a crash. I tested it, by looping through 20 Workbooks with the same amount of data.

I was wondering what is the most reliable approach when is necessary to work with more than 10 files?!
What would be the alternative for the codes I presented (RecordSet/ Range)??

Related to Matlab, the main purpose of this post is to develop the most reliable and fast code when pulling data from different files using excel. The comparison was not actually "fair" due different technologies. Nevermind about Matlab

Ps.:I'm using excel 2010. Powerquery can be a interesting idea, but will be necessary to buy a new version of Excel, and it is not in my plans/budget.
 
Last edited:

Fluff

MrExcel MVP, Moderator
Joined
Jun 12, 2014
Messages
40,837
Office Version
365
Platform
Windows
As far as I know, the PowerQuery add-in works with 2010
 

MARK858

MrExcel MVP
Joined
Nov 12, 2010
Messages
12,679
Office Version
365, 2010
Platform
Windows, Mobile
As far as I know, the PowerQuery add-in works with 2010
Only the professional addition officially (although I did get it to work for a short time with Home/Sudent).
 

Fluff

MrExcel MVP, Moderator
Joined
Jun 12, 2014
Messages
40,837
Office Version
365
Platform
Windows
Thanks for that Mark, didn't realise it was limited in that way.
 

Luthius

Active Member
Joined
Apr 5, 2011
Messages
274
My purpose is understand the limitations and also the pros and cons.
For instance, we know that Collection is faster than Dictionary. But Dictionary has more features that Collection. So we can choose what is the best as per need and per limitation as well.

If you guys can give more suggestion to pull a big amount of data (not only for a single file/source) taking into consideration performance and reliability.

Thanks for your comments so far. I really appreciate it.
 
Last edited:

Watch MrExcel Video

Forum statistics

Threads
1,099,317
Messages
5,467,895
Members
406,558
Latest member
MattJC7

This Week's Hot Topics

Top