VBA - Stream very large (GBs) single line (no crlf, cr, lf) text / xml file and modify same file.

KayTannee

New Member
Joined
Jan 11, 2012
Messages
23
Hi,

I'm trying to stream through a text file and remove a whole tag from the XML.
Code:
"<" & "Tagname */><tagname *="">"
. (Ampersand just there as forum deletes the tag)
<tagname *=""><tagname *="">The issue is the text/xml files can be anywhere from a few KBs to multiple GBs. The file also has no vbcrlf, vbcr or vblf in them.

Code:
Sub FixXML(FileName As String, TagName as string)
Dim FSO As Object, TxtS As Object
Dim OrigFileContents As String, NewFileContents As String
Dim StrtID As Long
Dim FrontBit As String, EndBit As String


Set FSO = CreateObject("Scripting.FileSystemObject")
Set TxtS = FSO.OpenTextFile(FileName)
OrigFileContents = TxtS.ReadAll


StrtID = InStr(1, OrigFileContents, "<" & TagName & " ", vbTextCompare)

If StrtID > 0 Then
    FrontBit = Left(OrigFileContents, StrtID - 1)
    EndBit = Mid(OrigFileContents, InStr(StrtID, OrigFileContents, "/>") + 2)
    NewFileContents = FrontBit & EndBit


    If OrigFileContents <> NewFileContents Then
        Set TxtS = FSO.OpenTextFile(FileName)
        TxtS.write NewFileContents
        TxtS.Close
        tCounter = tCounter + 1
    End If
End If


End Sub

This works great using a FileSystemObject to just ReadAll, modify the string and then write the whole string back to the same file. However if the file gets large enough then I get an out of memory error trying to load the string.

The below is currently method I'm messing around with. For it to work though, I'd probably need to be able to
a) Stream through by character backwards.
b) Be able to modify/snipout specific characters

Code:
Set FSO = CreateObject("Scripting.FileSystemObject")
Set TxtS = FSO.OpenTextFile(FileName)
Do Until TxtS.AtEndOfStream
    OrigFileContents = TxtS.Read(10000)
    'Do some modifying, being careful if tag looking for is split over 2 blocks.
    'TxtS.write(10000) = OrigFileContents  'This obviously doesn't work :( .....
Loop

I'd ideally like to be able to just make the snip in the file quickly, modifying the original file. The only solution I can currently think of would be a slow and disk space intensive Read one file, and append to another... Creating 2 files then deleting and replacing the original. There surely must be an easier way?

I've tried using something like this, but its slow to load in, and I can't work out how to delete my tag nor modify the file.
Code:
    Set XDoc = CreateObject("MSXML2.DOMDocument")
    XDoc.async = False: XDoc.validateOnParse = False
    XDoc.Load ("C:\temp\xml.xml")

If anyone can give me any help it would be appreciated.</tagname></tagname></tagname>
 

Excel Facts

Why does 9 mean SUM in SUBTOTAL?
It is because Sum is the 9th alphabetically in Average, Count, CountA, Max, Min, Product, StDev.S, StDev.P, Sum, VAR.S, VAR.P.

Forum statistics

Threads
1,214,919
Messages
6,122,259
Members
449,075
Latest member
staticfluids

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top