Pick pairs from an array

Jaymond Flurrie

Well-known Member
Joined
Sep 22, 2008
Messages
868
I'm once again trying to solve an array issue. This time I'm not even really sure whether I need to solve it.

So, I have an array where I save the opened tags. Let's keep this simple and say I just have a string, ASDFASDF. As I read it from beginning to end, I place each character I read in an array, vOpened. Once I find a match (the same character again), another array, vPaired, finds the one it matches, copies both locations and the character, and both are removed from vOpened.

So, let's see how it works:
First comes A; no match is found among the opened tags, so it goes into vOpened at spot 0.
Then comes S; no match is found among the opened tags, so it goes into vOpened at spot 1.
Then comes D; no match is found among the opened tags, so it goes into vOpened at spot 2.
Then comes F; no match is found among the opened tags, so it goes into vOpened at spot 3.
Then comes A; a match is found in vOpened at spot 0, so vPaired gets the values 0, 4 (since this was the fifth character) and A.

Then we get to my actual problem. Should I now somehow remove rows 0 and 4? And if so, how? Should I loop through vOpened and, whenever a spot is not empty, copy its value to the next open spot of vTemp (or whatever), then loop through vTemp and, once an empty spot is found, copy all the earlier spots back to vOpened?

Did I count right that I need four loops for all of this?

The reason I'm not sure whether I need to solve it at all: if an array has a million spots, 999,999 of them empty and only the last one holding a character, does the array require a lot of memory?
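For what it's worth, the removal can probably be done without vTemp at all. Below is a minimal sketch, assuming vOpened is a two-column array and a counter (nOpened, a name made up here) tracks how many rows are in use. A matched row is removed by moving the later rows up one place, so the array never collects empty spots in the middle. PairCharacters and its parameters are hypothetical, not part of any library. On the memory question: as far as I know, an array reserves space for every slot whether or not it holds a value (a Variant slot takes at least 16 bytes), so a million mostly empty slots are not free.

```vba
Option Explicit

' Hypothetical helper: pairs up repeated characters in src.
' Fills vPaired with rows of (first position, second position, character),
' positions 0-based as in the walkthrough, and returns the number of pairs.
Function PairCharacters(ByVal src As String, ByRef vPaired() As Variant) As Long
    Dim vOpened() As Variant      ' (row, 0) = character, (row, 1) = position
    Dim nOpened As Long           ' rows 0 .. nOpened - 1 are in use
    Dim nPaired As Long
    Dim i As Long, r As Long, k As Long, hit As Long
    Dim ch As String

    If Len(src) = 0 Then Exit Function
    ReDim vOpened(0 To Len(src) - 1, 0 To 1)
    ReDim vPaired(0 To Len(src) \ 2, 0 To 2)

    For i = 0 To Len(src) - 1
        ch = Mid$(src, i + 1, 1)

        ' Look for the same character among the opened entries.
        hit = -1
        For r = 0 To nOpened - 1
            If vOpened(r, 0) = ch Then
                hit = r
                Exit For
            End If
        Next r

        If hit < 0 Then
            ' No match: append to the end of the used part of vOpened.
            vOpened(nOpened, 0) = ch
            vOpened(nOpened, 1) = i
            nOpened = nOpened + 1
        Else
            ' Match: record the pair.
            vPaired(nPaired, 0) = vOpened(hit, 1)
            vPaired(nPaired, 1) = i
            vPaired(nPaired, 2) = ch
            nPaired = nPaired + 1

            ' Close the gap by moving the later rows up one place.
            For k = hit To nOpened - 2
                vOpened(k, 0) = vOpened(k + 1, 0)
                vOpened(k, 1) = vOpened(k + 1, 1)
            Next k
            nOpened = nOpened - 1
        End If
    Next i

    PairCharacters = nPaired
End Function
```

Called as n = PairCharacters("ASDFASDF", v) with Dim v() As Variant, this should give four pairs, the first being 0, 4, A. Only the second character of a pair triggers a removal, because it never enters vOpened in the first place.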
 


Legacy 68668

Guest
I don't quite understand what you're actually trying to do, though.

Code:
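Sub test()
Dim myStr, i As Long, vTemp
myStr = "ASDFASDF"
With CreateObject("Scripting.Dictionary")
    .CompareMode = vbTextCompare    ' treat "a" and "A" as the same key
    For i = 1 To Len(myStr)
        If Not .Exists(Mid(myStr, i, 1)) Then
            ' first sighting: remember the letter
            .Add Mid(myStr, i, 1), Nothing
        Else
            ' second sighting: the pair is complete, forget the letter
            .Remove Mid(myStr, i, 1)
        End If
    Next
    If .Count > 0 Then
        vTemp = .Keys    ' the letters that never found a partner
    Else
        MsgBox "No letter left"
    End If
End With
End Sub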
 

Jaymond Flurrie

OK, let's see. Anyway, thank you for the answer!

Edit: Why is myStr not declared as a String?

Edit 2: What does this return? I mean, the idea is to get vPaired as a result.
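As far as I can tell, in a declaration like Dim myStr, i As Long, vTemp only i gets a type; a variable without an As clause is a Variant. The posted Sub also returns nothing: it only leaves the unpaired letters in vTemp. Here is a rough sketch of how the same Dictionary idea could produce vPaired, assuming the Dictionary stores each letter's 0-based position as its item. PairWithDictionary is a made-up name.

```vba
Option Explicit

' Hypothetical variation on the Dictionary approach: fills vPaired with
' rows of (first position, second position, character) and returns the
' number of pairs. Positions are 0-based.
Function PairWithDictionary(ByVal myStr As String, _
                            ByRef vPaired() As Variant) As Long
    Dim dic As Object
    Dim i As Long, nPaired As Long
    Dim ch As String

    Set dic = CreateObject("Scripting.Dictionary")
    dic.CompareMode = vbTextCompare          ' must be set while empty
    ReDim vPaired(0 To Len(myStr) \ 2, 0 To 2)

    For i = 1 To Len(myStr)
        ch = Mid$(myStr, i, 1)
        If Not dic.Exists(ch) Then
            dic.Add ch, i - 1                ' remember where it opened
        Else
            vPaired(nPaired, 0) = dic(ch)    ' position of the first one
            vPaired(nPaired, 1) = i - 1      ' position of the second one
            vPaired(nPaired, 2) = ch
            nPaired = nPaired + 1
            dic.Remove ch                    ' nothing left to compact
        End If
    Next i

    PairWithDictionary = nPaired
End Function
```

The Dictionary takes over the role of vOpened here, so there are no empty spots to clean up. The catch is that a Dictionary holds only one entry per key, which would not be enough if the same tag could be open twice at once.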
 

Jaymond Flurrie

I think what Seiya posted is getting very close.

So the idea is that, since I'm writing a parser for my thesis, we could try this with an overly simple "web page":
html
body
some text here
/body
/html

where html is an opening tag, /html is the matching closing tag, and the content is not wanted as a result.

body is an opening tag, /body is the matching closing tag, and the content is wanted as a result. body has to reside inside the html - /html pair (i.e. html has to come first, then body, regardless of whether /html comes later).

Now my parser tries to find html and finds it. Then it tries to find body and finds it. Then it saves a slightly adjusted opening position of body (this is what the original question is about) and tries to find the /body tag. It finds it and saves the content, which is "some text here", and so on. The final result is ""some text here" at position XXX".

What I asked in this thread is extremely simple to do using just a worksheet and its functions, but I wanted to eliminate everything that has to do with Excel so that this would be much more portable. That is where the need for arrays comes from.

So the whole question here is: what's the easiest way to save the content? I think the library that Seiya suggested is a very good option and I think I'll use it; I just tried to explain a bit where I would use it.
 

Legacy 68668

Guest


Can you show me a sample HTML tag with the result?

A dummy sample is OK, but the structure should be realistic, otherwise you will have trouble adapting the code.
 

Jaymond Flurrie

Well, if the source is

"html body some /body /html html s /html"
("h" being the first character and "l" being the last character)
(whether it's HTML or not is totally irrelevant to my thesis; binary code is fine), and we assume that we want to save both the html and body tag pairs, then vOpened and vPaired are processed like this:

html opens at position 4, so the result at that point is
vOpened(0,0)=5 (because first mark of that content is opening tag end + 1)

(So there's the first found in vOpened)

Then it finds body at position 9, so the result at that point is
vOpened(0,0)=5
vOpened(1,0)=10

Then it finds /body at position 20, so because it matches with the body found already in position 9, the result is
vOpened(0,0)=5

(Question: How many rows should vOpened have? One? Two? If two, then the latter is clearly empty, but is there any reason to use code to remove it from the array?)

and

vPaired(0,0)=10
vPaired(0,1)=15 (the position where /body was found - length of it)
vPaired(0,2)=" some " (note the two spaces, one leading and one trailing the string)


Then it finds /html at position 26, so because it matches with the html found already in position 4, the result is
(vopened is empty)

vPaired(0,0)=10
vPaired(0,1)=15
vPaired(0,2)=" some "
vPaired(1,0)=5
vPaired(1,1)=21
vPaired(1,2)=" body some /body "

html opens at position 31, so the result at that point is
vOpened(2,0) = 32
(Question: Should this now be at vOpened(0,0)?)

and

vPaired(0,0)=10
vPaired(0,1)=15
vPaired(0,2)=" some "
vPaired(1,0)=5
vPaired(1,1)=21
vPaired(1,2)=" body some /body "

Then it finds /html at position 39, so the result at that point is
(vOpened is empty)
vPaired(0,0)=10
vPaired(0,1)=15
vPaired(0,2)=" some "
vPaired(1,0)=5
vPaired(1,1)=21
vPaired(1,2)=" body some /body "
vPaired(2,0)=32
vPaired(2,1)=34
vPaired(2,2)=" s "

which is also the final result, a 3x3 array.

So I can easily find the pairs and save them to vPaired; using vOpened is the problem. Should I remove an element from vOpened once it has been emptied (like after finding that /body), or should I just keep adding elements and end up with a million empty elements before there is any actual data? That library sounds like a nice thing, but maybe it is too powerful for this, since in the ideal situation (like my example here) the library ends up empty.
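One way to sidestep both questions is to treat vOpened as a stack with a counter, so a freed row is simply reused by the next opening tag. Below is a minimal sketch under some loud assumptions: tags are bare words such as html and /html, they are properly nested, and a tag name never appears in the content. PairTags, tagNames and nOpened are names invented for this illustration.

```vba
Option Explicit

' Hypothetical sketch: fills vPaired with rows of (content start,
' content end, content) and returns the number of pairs. Positions are
' 1-based, following the walkthrough above.
Function PairTags(ByVal src As String, ByVal tagNames As Variant, _
                  ByRef vPaired() As Variant) As Long
    Dim vOpened() As Variant   ' stack: (row, 0) = content start, (row, 1) = tag
    Dim nOpened As Long, nPaired As Long
    Dim pos As Long, p As Long, best As Long, lastChar As Long, r As Long
    Dim t As Variant, tagName As String, isClose As Boolean

    ReDim vOpened(0 To Len(src), 0 To 1)   ' generous upper bounds
    ReDim vPaired(0 To Len(src), 0 To 2)
    pos = 1
    Do
        ' Find the earliest occurrence of any tag name at or after pos.
        best = 0
        For Each t In tagNames
            p = InStr(pos, src, t, vbBinaryCompare)
            If p > 0 And (best = 0 Or p < best) Then
                best = p
                tagName = t
            End If
        Next t
        If best = 0 Then Exit Do

        lastChar = best + Len(tagName) - 1
        isClose = False
        If best > 1 Then isClose = (Mid$(src, best - 1, 1) = "/")

        If Not isClose Then
            ' Opening tag: push onto the first free row.
            vOpened(nOpened, 0) = lastChar + 1
            vOpened(nOpened, 1) = tagName
            nOpened = nOpened + 1
        Else
            ' Closing tag: search from the top of the stack downwards.
            For r = nOpened - 1 To 0 Step -1
                If vOpened(r, 1) = tagName Then
                    vPaired(nPaired, 0) = vOpened(r, 0)
                    vPaired(nPaired, 1) = lastChar - (Len(tagName) + 1)
                    vPaired(nPaired, 2) = Mid$(src, vOpened(r, 0), _
                        vPaired(nPaired, 1) - vOpened(r, 0) + 1)
                    nPaired = nPaired + 1
                    nOpened = r    ' pop the match and anything above it
                    Exit For
                End If
            Next r
        End If
        pos = lastChar + 1
    Loop
    PairTags = nPaired
End Function
```

With the sample string and Array("html", "body") as tagNames, this should reproduce the three rows listed above. The second html lands in vOpened(0,0) rather than vOpened(2,0), because the counter is back at zero by then, and vOpened never needs more rows than the deepest nesting.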
 

Legacy 68668

Guest


Do you just want to extract the body?
Code:
Sub test()
Dim myStr As String
myStr = "html body some /body /html html s /html"
With CreateObject("VBScript.RegExp")
    .Pattern = "body(.+)/body"
    If .test(myStr) Then
        MsgBox "Body is" & vbLf & .execute(myStr)(0).submatches(0)
    Else
        MsgBox "Body is not found"
    End If
End With
End Sub
 

Jaymond Flurrie

Basically yes, but the whole idea is that it has to work for any tag, and I already have code that does all of this. The only thing I'm missing is this: instead of using worksheet functions and copying the found matches (for example, a couple of hundred table cells) to a new workbook, I would like to get an array that I could then put anywhere without using Excel worksheets or worksheet functions.

Otherwise this RegExp approach might be even better than what I already have, but it doesn't work with comments and strings (I mean cases where part of the source string is defined to be treated purely as a string or as a comment: don't search there, skip those parts, resume the tag tree, etc.). What you posted looks like it just answers "is it there, and if so, where?" instead of actually returning an array. I'm looking more for an answer on how arrays like these are used effectively, whether empty elements are a bad thing and, if so, whether having four loops is a better solution; not necessarily for code.
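For reference, a RegExp can hand back an array rather than a yes/no answer. Here is a rough sketch, assuming a non-greedy pattern with Global switched on; RegExpPairs is a made-up name. It does not solve the comment and string problem, it does not handle a tag nested inside a tag of the same name, and it assumes tagName contains no RegExp special characters.

```vba
Option Explicit

' Hypothetical sketch: returns a 2-D array with one row per match,
' (row, 0) = content start, (row, 1) = content end (both 1-based),
' (row, 2) = content. Returns Empty when nothing matches.
Function RegExpPairs(ByVal src As String, ByVal tagName As String) As Variant
    Dim re As Object, matches As Object, m As Object
    Dim result() As Variant
    Dim i As Long

    Set re = CreateObject("VBScript.RegExp")
    re.Global = True                               ' find every match
    re.Pattern = tagName & "(.+?)/" & tagName      ' +? stops at the first closing tag
    Set matches = re.Execute(src)
    If matches.Count = 0 Then Exit Function

    ReDim result(0 To matches.Count - 1, 0 To 2)
    For i = 0 To matches.Count - 1
        Set m = matches(i)
        ' FirstIndex is 0-based and points at the start of the opening tag.
        result(i, 0) = m.FirstIndex + Len(tagName) + 1
        result(i, 1) = result(i, 0) + Len(m.SubMatches(0)) - 1
        result(i, 2) = m.SubMatches(0)
    Next i
    RegExpPairs = result
End Function
```

The array is sized from matches.Count, so it has no empty elements. Note that "." does not match a line break in VBScript.RegExp, so content spanning several lines would need a different pattern.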
 

Legacy 68668

Guest
Shouldn't the HTML tags look like <....>......</....>?

It is too hard when the text doesn't have those.
 

Jaymond Flurrie

Another problem is that the number of tags it looks for at once can be any positive number: first it looks for HTML and XML tags; then, if XML is found, it looks for the BOOK tag, or if HTML is found, it looks for /* and the BODY, IMG and HEAD tags, etc. That was kind of tricky to do but, as I said, it's all done. I just need to know which is more efficient: using four loops or having a lot of empty elements in an array.
 
