Excel 2010 VBA - Searching XML using InStr vs Parsing - (result not question)

mchac

Well-known Member
Joined
Apr 15, 2013
Messages
531
If anyone is having Out of Memory errors running procedures that use InStr on large docs, the following may help.

For the past few months, I've been fighting what seems like a memory leak and periodic Out of Memory errors when running a procedure that iteratively searches XML docs. I've posted questions about aspects of this and members of this board have been very helpful.

What seems be a key part of the code is posted below.

Last night I ran an updated procedure that saved the Excel WorkingSet memory use at each iteration and I noticed today that (some) of the XML docs that are not searched correspond with Excel having a high WorkingSet allocation (600+ mb).

Part of the procedure I ran last night uses InStr to search the XML docs. I've also been working on learning parsing so I can use this method to search the docs, instead of using InStr, thinking it might be faster. So the following improved memory usage result was unexpected.

The procedure below has two .Load commands; the first is for a small-ish doc and the second is for a large doc. There are two sections between '++++, to make them easier to identify, that I alternated between to test memory use. The first section (obviously) uses parsing and the second (currently commented out) uses InStr.

The results below are for that second large doc I referred to above.
Note that I'm a rookie at most of this (parsing in particular) so my syntax may be the problem. If anyone can suggest an improvement that will help memory use, I would be grateful.

Code:
                                  WorkingSet Parsing     WorkingSet InStr
Start Excel                          52 mb                        52 mb
Open .xlsm                         149 mb                      149 mb
Open VBA window                 156 mb                     156 mb
.Load                              380 mb                      379 mb
Step Through Max Mem          846 mb **                 385 mb **
Exit Sub                            165 mb                      168 mb

'Step Through Max Mem' means I was stepping through the code, watching Task Manager and noting the memory use.

Obviously this isn't a scientific test. There are things running in background that will impact results, I was watching Task Manager rather than using some memory management software etc etc, but I ran this several times commenting out one of the two methods, saving, exiting excel, starting again and stepping through the procedure. The results were nearly identical so it seems to me the large increase in memory use is tied to InStr.

If anyone has another idea as to what is causing the memory use peak, I'm really interested to hear about it.

Code:
Sub tester3()

    Dim oInstance As MSXML2.DOMDocument60
    Dim n
    Dim o
    
    Dim oNodelist As MSXML2.IXMLDOMNodeList
    Dim m_strUSGAAP_TaxonomyVersion As String
    Dim m_strInvest_TaxonomyVersion As String
    Dim m_strDEI_TaxonomyVersion As String
    
    Set oInstance = New MSXML2.DOMDocument60
    oInstance.setProperty "SelectionLanguage", "XPath"
    oInstance.setProperty "SelectionNamespaces", "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
    oInstance.async = False
    oInstance.validateOnParse = False
    'oInstance.Load "http://www.sec.gov/Archives/edgar/data/1143908/000114390813000035/ck0001143908-20130831.xml"
    oInstance.Load "http://www.sec.gov/Archives/edgar/data/7323/000006598413000050/etr-20121231.xml"
    
    m_strContextForInstants = "ERROR"
    m_strContextForDurations = "ERROR"
        
    '++++++++++++++++++++*
    
    Set n = oInstance.namespaces
    
    Do
        For Each o In n
            If o = "http://fasb.org/us-gaap/2013-01-31" Then
                m_strUSGAAP_TaxonomyVersion = "http://fasb.org/us-gaap/2013-01-31"
                m_strInvest_TaxonomyVersion = "http://xbrl.sec.gov/invest/2013-01-31"
                NO_USGAAP_TAXONOMY_INFO = False
                Exit Do
            ElseIf o = "http://fasb.org/us-gaap/2012-01-31" Then
                m_strUSGAAP_TaxonomyVersion = "http://fasb.org/us-gaap/2012-01-31"
                m_strInvest_TaxonomyVersion = "http://xbrl.sec.gov/invest/2012-01-31"
                NO_USGAAP_TAXONOMY_INFO = False
                Exit Do
            ElseIf o = "http://fasb.org/us-gaap/2011-01-31" Then
                m_strUSGAAP_TaxonomyVersion = "http://fasb.org/us-gaap/2011-01-31"
                m_strInvest_TaxonomyVersion = "http://xbrl.sec.gov/invest/2011-01-31"
                NO_USGAAP_TAXONOMY_INFO = False
                Exit Do
            ElseIf o = "http://xbrl.us/us-gaap/2009-01-31" Then
                m_strUSGAAP_TaxonomyVersion = "http://xbrl.us/us-gaap/2009-01-31"
                m_strInvest_TaxonomyVersion = "http://xbrl.us/invest/2019-01-31"
                NO_USGAAP_TAXONOMY_INFO = False
                Exit Do
            Else: NO_USGAAP_TAXONOMY_INFO = True
            End If
        Next o
    Loop While False


    'DEI Taxonomy
    Do
        For Each o In n
            If o = "http://xbrl.sec.gov/dei/2013-01-31" Then
                m_strDEI_TaxonomyVersion = "http://xbrl.sec.gov/dei/2013-01-31"
                NO_DEI_TAXONOMY_INFO = False
                Exit Do
            ElseIf o = "http://xbrl.sec.gov/dei/2012-01-31" Then
                m_strDEI_TaxonomyVersion = "http://xbrl.sec.gov/dei/2012-01-31"
                NO_DEI_TAXONOMY_INFO = False
                Exit Do
            ElseIf o = "http://xbrl.sec.gov/dei/2011-01-31" Then
                m_strDEI_TaxonomyVersion = "http://xbrl.sec.gov/dei/2011-01-31"
                NO_DEI_TAXONOMY_INFO = False
                Exit Do
            ElseIf o = "http://xbrl.us/dei/2009-01-31" Then
                m_strDEI_TaxonomyVersion = "http://xbrl.us/dei/2009-01-31"
                NO_DEI_TAXONOMY_INFO = False
                Exit Do
            Else: NO_DEI_TAXONOMY_INFO = True
            End If
        Next o
    Loop While False


    '+++++++++++++++++++++
    
    'If InStr(1, oInstance.XML, "http://fasb.org/us-gaap/2013-01-31") <> 0 Then
        'This IS the 2012 US GAAP taxonomy
    '    m_strUSGAAP_TaxonomyVersion = "http://fasb.org/us-gaap/2013-01-31"
    '    m_strInvest_TaxonomyVersion = "http://xbrl.sec.gov/invest/2013-01-31"  'NEED TO CONFIRM THE DATE HERE
    'ElseIf InStr(1, oInstance.XML, "http://fasb.org/us-gaap/2012-01-31") <> 0 Then
        'This IS the 2012 US GAAP taxonomy
    '    m_strUSGAAP_TaxonomyVersion = "http://fasb.org/us-gaap/2012-01-31"
    '    m_strInvest_TaxonomyVersion = "http://xbrl.sec.gov/invest/2012-01-31"
    'ElseIf InStr(1, oInstance.XML, "http://fasb.org/us-gaap/2011-01-31") <> 0 Then
        'This IS the 2011 US GAAP taxonomy
    '    m_strUSGAAP_TaxonomyVersion = "http://fasb.org/us-gaap/2011-01-31"
    '    m_strInvest_TaxonomyVersion = "http://xbrl.sec.gov/invest/2011-01-31"
    'ElseIf InStr(1, oInstance.XML, "http://xbrl.us/us-gaap/2009-01-31") <> 0 Then
        'This IS the 2009 US GAAP taxonomy
    '    m_strUSGAAP_TaxonomyVersion = "http://xbrl.us/us-gaap/2009-01-31"
    '    m_strInvest_TaxonomyVersion = "http://xbrl.us/invest/2019-01-31"
    'Else: NO_USGAAP_TAXONOMY_INFO = True
    'End If
    
    'DEI Taxonomy
    'If InStr(1, oInstance.XML, "http://xbrl.sec.gov/dei/2013-01-31") <> 0 Then
        'This IS the 2012 DEI taxonomy
    '    m_strDEI_TaxonomyVersion = "http://xbrl.sec.gov/dei/2013-01-31"
    'ElseIf InStr(1, oInstance.XML, "http://xbrl.sec.gov/dei/2012-01-31") <> 0 Then
        'This IS the 2012 DEI taxonomy
    '    m_strDEI_TaxonomyVersion = "http://xbrl.sec.gov/dei/2012-01-31"
    'ElseIf InStr(1, oInstance.XML, "http://xbrl.sec.gov/dei/2011-01-31") <> 0 Then
        'This IS the 2011 DEI taxonomy
    '    m_strDEI_TaxonomyVersion = "http://xbrl.sec.gov/dei/2011-01-31"
    'ElseIf InStr(1, oInstance.XML, "http://xbrl.us/dei/2009-01-31") <> 0 Then
       'This IS the 2009 DEI taxonomy
    '     m_strDEI_TaxonomyVersion = "http://xbrl.us/dei/2009-01-31"
    'Else: NO_DEI_TAXONOMY_INFO = True
    'End If
    
    '+++++++++++++++++++++
    
    If NO_USGAAP_TAXONOMY_INFO = True Or NO_DEI_TAXONOMY_INFO = True Then
        Exit Sub
    End If
    
    oInstance.setProperty "SelectionNamespaces", "xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xmlns:xbrli='http://www.xbrl.org/2003/instance' xmlns='http://www.xbrl.org/2003/instance' xmlns:xbrldi='http://xbrl.org/2006/xbrldi' xmlns:us-gaap='" & m_strUSGAAP_TaxonomyVersion & "' xmlns:dei='" & m_strDEI_TaxonomyVersion & "' xmlns:currency='http://xbrl.sec.gov/currency/2012-01-31' xmlns:invest='" & m_strInvest_TaxonomyVersion & "'"
                
    Set oNodelist = oInstance.SelectNodes("//dei:EntityRegistrantName")
    
    If oNodelist.Length > 0 Then
        Debug.Print "found it: " & oInstance.SelectSingleNode("//dei:EntityRegistrantName").Text
    Else
        Debug.Print "didn't find it"
    End If


End Sub
 
Last edited:

Excel Facts

Convert text numbers to real numbers
Select a column containing text numbers. Press Alt+D E F to quickly convert text to numbers. Faster than "Convert to Number"

Forum statistics

Threads
1,214,392
Messages
6,119,257
Members
448,880
Latest member
aveternik

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top