Randomize a Column & Identify a Word

rwmill9716 · Aug 25, 2023

I have a set of letters in Column A. That Column has been randomized in Column B where the word 'Ric' from Cell C1 is found. I need a macro that will randomize Column A 1000 times and count how many times 'Ric' is spelled out.

bebo021999 · Aug 30, 2023

A2:A4993 consists of 192 sets of A-Z, sequentially arranged as each set goes from A to Z and then repeats (referred to as the "List").
I tested with n=100 iterations. Adjustable in code.
For each test iteration:

This "List" will shuffle in position, but all items remains unchanged (192 A's, 192 B's,..., 192 Z's).
Afterward, it looks for sequences of R, I, C, or C, I, R (a condition sequence of any length specified in cell F1).

After n test iterations, the results will be in the range E:H.
If you want to know the specific results of each test iteration, they can be found in the columns to the right, starting from column M.

Book2.xlsm

drive.google.com

VBA Code:

Option Explicit
Sub words()
Dim lr&, i&, j&, k&, r, rng, arr(), res(), word As String, n&, c&, t&, st As String, count&
Dim dic As Object
Set dic = CreateObject("Scripting.Dictionary")
word = Range("F1").Value ' word criteria
n = 100 ' running time
ReDim res(1 To n, 1 To 3)
lr = Cells(Rows.count, "A").End(xlUp).Row
rng = Range("A2:A" & lr).Value ' given list in col A
Range("F3:H100000").ClearContents
Range("N2:XX100000").ClearContents
Randomize
Do
    ReDim arr(1 To UBound(rng), 1 To 1)
    c = c + 1: k = 0: count = 0
    Do
        r = Int(Rnd * UBound(rng)) + 1
        If Not dic.exists(r) Then
            dic.Add r, ""
            k = k + 1: arr(k, 1) = rng(r, 1)
        End If
    Loop Until k >= UBound(rng)
    dic.RemoveAll
    For i = 1 To UBound(arr)
        st = ""
        If arr(i, 1) = Left(word, 1) Then
            On Error Resume Next
            For j = 0 To Len(word) - 1
                st = st & arr(i + j, 1)
            Next
            On Error GoTo 0
            If st = word Then count = count + 1
        End If
        If arr(i, 1) = Right(word, 1) Then
            On Error Resume Next
            For j = Len(word) - 1 To 0 Step -1
                st = st & arr(i - j, 1)
            Next
            On Error GoTo 0
            If st = word Then count = count + 1
        End If
        If st = word Then res(c, 1) = count: res(c, 2) = st: res(c, 3) = i
    Next
    If count > 0 Then
        t = t + 1
        With Range("XX3").End(xlToLeft).Offset(0, 1)
            .Offset(-2, 0).Value = t
            .Offset(-1, 0).Value = count
            .Resize(UBound(arr), 1).Value = arr
        End With
    End If
Loop Until c >= n
Range("F3").Resize(n, 3).Value = res
MsgBox "Finish! match found: " & t
End Sub

rwmill9716 · Aug 30, 2023

bebo021999 said:

A2:A4993 consists of 192 sets of A-Z, sequentially arranged as each set goes from A to Z and then repeats (referred to as the "List").
I tested with n=100 iterations. Adjustable in code.
For each test iteration:

This "List" will shuffle in position, but all items remains unchanged (192 A's, 192 B's,..., 192 Z's).
Afterward, it looks for sequences of R, I, C, or C, I, R (a condition sequence of any length specified in cell F1).

After n test iterations, the results will be in the range E:H.
If you want to know the specific results of each test iteration, they can be found in the columns to the right, starting from column M.

Book2.xlsm

drive.google.com

VBA Code:

Option Explicit
Sub words()
Dim lr&, i&, j&, k&, r, rng, arr(), res(), word As String, n&, c&, t&, st As String, count&
Dim dic As Object
Set dic = CreateObject("Scripting.Dictionary")
word = Range("F1").Value ' word criteria
n = 100 ' running time
ReDim res(1 To n, 1 To 3)
lr = Cells(Rows.count, "A").End(xlUp).Row
rng = Range("A2:A" & lr).Value ' given list in col A
Range("F3:H100000").ClearContents
Range("N2:XX100000").ClearContents
Randomize
Do
    ReDim arr(1 To UBound(rng), 1 To 1)
    c = c + 1: k = 0: count = 0
    Do
        r = Int(Rnd * UBound(rng)) + 1
        If Not dic.exists(r) Then
            dic.Add r, ""
            k = k + 1: arr(k, 1) = rng(r, 1)
        End If
    Loop Until k >= UBound(rng)
    dic.RemoveAll
    For i = 1 To UBound(arr)
        st = ""
        If arr(i, 1) = Left(word, 1) Then
            On Error Resume Next
            For j = 0 To Len(word) - 1
                st = st & arr(i + j, 1)
            Next
            On Error GoTo 0
            If st = word Then count = count + 1
        End If
        If arr(i, 1) = Right(word, 1) Then
            On Error Resume Next
            For j = Len(word) - 1 To 0 Step -1
                st = st & arr(i - j, 1)
            Next
            On Error GoTo 0
            If st = word Then count = count + 1
        End If
        If st = word Then res(c, 1) = count: res(c, 2) = st: res(c, 3) = i
    Next
    If count > 0 Then
        t = t + 1
        With Range("XX3").End(xlToLeft).Offset(0, 1)
            .Offset(-2, 0).Value = t
            .Offset(-1, 0).Value = count
            .Resize(UBound(arr), 1).Value = arr
        End With
    End If
Loop Until c >= n
Range("F3").Resize(n, 3).Value = res
MsgBox "Finish! match found: " & t
End Sub

View attachment 97964

BEBO,
Thanks for your excellent work here.

I have a few questions for clarification:

1. What does the count in Column F and Row 2 (Cols N to DI mean? Why is it always an even number? In column N, for instance, I see only one occurrence, yet the count is 2.
2. I assume that each column (M and beyond) is a randomization of Column A, one that contains a match? Could I increase the size of Column A? Say I wanted to double it; what changes would be necessary in the code?
3. In your test case above, exactly how many randomized columns were analyzed? As I read your example, would I have found "ALL" or "LLA" 18 times in 100 randomizations 5000 letters, Correct? That would be: the 3 letter word "ALL" or its inverse in a random sequence of 500,000 letters.
4. When I run the program, the spinning wheel continues after your message pops up the count of matches. If I hit "Ok" for that message, the spinning wheel stops. I assume there's no reason to just wait until it stops spinning.

rwmill9716 · Aug 30, 2023

rwmill9716 said:
BEBO,
Thanks for your excellent work here.

I have a few questions for clarification:

1. What does the count in Column F and Row 2 (Cols N to DI mean? Why is it always an even number? In column N, for instance, I see only one occurrence, yet the count is 2.
2. I assume that each column (M and beyond) is a randomization of Column A, one that contains a match? Could I increase the size of Column A? Say I wanted to double it; what changes would be necessary in the code?
3. In your test case above, exactly how many randomized columns were analyzed? As I read your example, would I have found "ALL" or "LLA" 18 times in 100 randomizations 5000 letters, Correct? That would be: the 3 letter word "ALL" or its inverse in a random sequence of 500,000 letters.
4. When I run the program, the spinning wheel continues after your message pops up the count of matches. If I hit "Ok" for that message, the spinning wheel stops. I assume there's no reason to just wait until it stops spinning.

I see that the count is counting both the word and its inverse, i.e., the pair will always be there.

bebo021999 · Aug 30, 2023

I've made some adjustments to the code, specifically in the word reversal part, and it's now working fine.

Column E represents the test number, ranging from 1 to 100.Column F counts the occurrences of either RIC or CIR in that specific test.Column G contains the found word. If there are two or more words, it belongs to the 1st found. Column H indicates the row number where the word was 1st found.

Starting from column N, there's a detailed breakdown of the random characters in each test where the word was discovered.

In the file, I'm currently testing for a random occurrence in test 11. It's found twice (E13:H13) on row 2486.The details of the character sets are in column O. Column M contains the string concatenation formula.

You can filter column M with two conditions: contains RIC or CIR, which will give you the two rows with RIC or CIR. (manually mark with "x" in column L)

Book2.xlsm

drive.google.com

bebo021999 · Aug 30, 2023

rwmill9716 said:
BEBO,
Thanks for your excellent work here.

I have a few questions for clarification:

Hello

1. What does the count in Column F and Row 2 (Cols N to DI mean? Why is it always an even number? In column N, for instance, I see only one occurrence, yet the count is 2.
Even number: Fixed in previous post.

2. I assume that each column (M and beyond) is a randomization of Column A, one that contains a match?
Yes.

- Could I increase the size of Column A? Say I wanted to double it; what changes would be necessary in the code?
Just copy/ paste/ or manual input more into column A
In the code, "lr" (Last row) to update last row col A automatically

3. In your test case above, exactly how many randomized columns were analyzed?
in example, would be 18 within 100 test

- As I read your example, would I have found "ALL" or "LLA" 18 times in 100 randomizations 5000 letters, Correct?
Correct. 192 A's, 192 B's,..., 192 A's
= 192*26=4992

That would be: the 3 letter word "ALL" or its inverse in a random sequence of 500,000 letters.
ALL or LLA, if found then count

4. When I run the program, the spinning wheel continues after your message pops up the count of matches. If I hit "Ok" for that message, the spinning wheel stops. I assume there's no reason to just wait until it stops spinning.
Jsu remove the msgbox in the last row in code script.

rwmill9716 · Sep 1, 2023

BEBO,

Thanks again, this new version works very well.

Could you adapt this program to find sequences that skip letters? For instance, R I C (C I R) where the skips could be from 0 to 5 spaces where the spaces between letters is the same. Each space would be filled with any letter including R, I, C. For example, using 5 skips, I might get: XHRRTXIPIOBNKCCP... The bold letters skip 5, meeting the search criteria, while the underlined letters have an unbalanced skip 2 between R & I and 6 between I & C. If you prefer, I could write this in a new question?

bebo021999 · Sep 3, 2023

Of course, I can assist you with this new request. However, I assume that you have understood how my code works. If you do, please go ahead and customize it for your new requirements, alright? I'll be ready to help you step by step.

rwmill9716 · Sep 5, 2023

bebo021999 said:
Of course, I can assist you with this new request. However, I assume that you have understood how my code works. If you do, please go ahead and customize it for your new requirements, alright? I'll be ready to help you step by step.

Bebo,
I've come across a problem with your macro: different letter sequences give me different results. In column A, you have 192 copies of our 26 letter alphabet which you randomize 100 times and look for a 3-letter sequence. Under those conditions, I would expect no differences among any 3-letter combinations. Here's a sample where I've run each letter sequence 5 times through your latest version.

RIC: 36, 36, 34, 35, 37
ABC: 23, 25, 23, 24,22
XYZ: 39, 40, 38, 38, 39
MNO: 42, 40, 39, 43, 40

I would like to adapt this program and am studying yours's to see if I can do it.
I would like to enter my word as: RIC, R I C, R I C, R I C, R I C, R I C, three letters with zero to 5 spaces which can be filled with any of the other letters, including the ones in the sequence that I'm looking for. Fo example, RRIZC would be RIC with a one space skip. One way I tried to do this was to change all the letters to X except the three used in the sequence that I'm looking for, but the problem is that the Xs wouldn't include other letters in the sequence: RXXXIXXXC for example would be a 3 letter skip but none of the Xs would include members from the 192 RIC letters.

Ric

bebo021999 · Sep 6, 2023

Try again to see if it works.

test.xlsm

drive.google.com

rwmill9716 · Sep 6, 2023

Thanks, Bebo, this works great. My next step will be to load a book letter by letter to replace the 192 copies of the alphabet, and then look for the frequency of occurrences of certain words in the book itself, as well as the randomized versions, which your program does. I assume that there will be no problem with changing the length of Col A (say 500 to 10000 letters). To accomplish the first (look for the word sequence in the nonrandomized column), what changes do I need to make to the code to eliminate the randomization. Note, this will be a 2nd program, and I don't care if the 100 columns are the same. I tried eliminating Randomize command in the code and that didn't work, also change the definition of r=INT(Rnd*... to r=INT(1... but neither worked.

Randomize a Column & Identify a Word

rwmill9716

Active Member

Attachments

bebo021999

Well-known Member

Book2.xlsm

Excel Facts

rwmill9716

Active Member

Book2.xlsm

rwmill9716

Active Member

bebo021999

Well-known Member

Book2.xlsm

bebo021999

Well-known Member

rwmill9716

Active Member

bebo021999

Well-known Member

rwmill9716

Active Member

bebo021999

Well-known Member

test.xlsm

rwmill9716

Active Member

Similar threads

Forum statistics

Share this page

Randomize a Column & Identify a Word

Active Member

Attachments

Well-known Member

Excel Facts

Active Member

Active Member

Well-known Member

Well-known Member

Active Member

Well-known Member

Active Member

Well-known Member

Active Member

Similar threads

Forum statistics

Share this page

We've detected that you are using an adblocker.

Which adblocker are you using?

Disable AdBlock

Disable AdBlock Plus

Disable uBlock Origin

Disable uBlock