Find partial string match

JayG30

New Member
Joined
Sep 2, 2014
Messages
10
Hello,

I've read a bit already on this but what is posted doesn't seem to meet my needs.

I have a large table of data that contains a column with an exact Organism specification (genus and species). Some rows even have multiple organisms delimited by a semicolon (;). The data is pulled from a oracle DB. There is also another column that I am going to have to calculate called "Gram +/-". The table looks like this:

IDOrganismGram +/-
1Staphylococcus capitis
2Bacillus species;Staphylococcus capitis
3Staphylococcus aureus;Enterococcus faecalis;Pseudomonas aeruginosa

<tbody>
</tbody>

The "Gram +/-" is determined by the "Genus" of the organism. So in the Organism the genus part would be the first word (ie. Staphylococcus, Bacillus, Enterococcus, Pseudomonas). The relationship between these and Gram +/Gram - would just be stored in a lookup table is what I was thinking. Like this.

GenusGram +/-
Staphylococcus+
Bacillus+
Enterococcus-
Pseudomonas-

<tbody>
</tbody>

However, I can't seem to determine a way to match "Staphylococcus capitis" (and the others) against a lookup table like the following for "partial string matches". And I also need it to be able to match multiple times for those that have multiple organisms. The table would ideally end up like this;

IDOrganismGram +/-
1Staphylococcus capitis+
2Bacillus species;Staphylococcus capitis+;+
3Staphylococcus aureus;Enterococcus faecalis;Pseudomonas aeruginosa+;-;-

<tbody>
</tbody>

I have a feeling this just might not work. The best I've come up with is;
Code:
=IF(ISERROR(LOOKUP(2^15,SEARCH(lookup_gram[Genus],[@Organism]),lookup_gram[Genus])),"",LOOKUP(2^15,SEARCH(lookup_gram[Genus],[@Organism]),lookup_gram[Gram]))
but that won't find partial matches when there are more then one organism.
 

Excel Facts

Can Excel fill bagel flavors?
You can teach Excel a new custom list. Type the list in cells, File, Options, Advanced, Edit Custom Lists, Import, OK
You could do a text to column data application on your organism column then have something like:
=VLOOKUP(LEFT(B2,FIND(" ",B2)-1),'Gram Ref'!$a$2:$a$5,2,FALSE)
And then you can concatenate all of the plus/minuses with a ; symbol.
 
Upvote 0
I could not think of a good way to do what you asked for using formulas (mainly because of the indeterminate number of semi-colon delimited organisms that can be in a cell), so I wrote a UDF (user defined function for you)...
Code:
Function GetGrams(Organisms As String, LookUpTable As Range) As String
  Dim X As Long, Z As Long, Orgs() As String
  Static LookupData As Variant
  If IsEmpty(LookupData) Then LookupData = LookUpTable
  Orgs = Split(Organisms, ";")
  For X = 0 To UBound(Orgs)
    For Z = 1 To UBound(LookupData)
      If Split(Orgs(X))(0) = LookupData(Z, 1) Then
        GetGrams = GetGrams & ";" & LookupData(Z, 2)
        Exit For
      End If
    Next
  Next
  GetGrams = Mid(GetGrams, 2)
End Function

HOW TO INSTALL UDFs
------------------------------------
If you are new to UDFs, they are easy to install and use. To install it, simply press ALT+F11 to go into the VB editor and, once there, click Insert/Module on its menu bar, then copy/paste the above code into the code window that just opened up. That's it.... you are done. You can now use GetGrams just like it was a built-in Excel function. For example,

=GetGrams(B2,Sheet2!A1:B5)

where the first argument (B2) is cell reference to the Organism cell you want to get the Gram symbol(s) for and the second argument is a range reference to the cells making up your "Gram+/-" table.

If you are using XL2007 or above, make sure you save your file as an "Excel Macro-Enabled Workbook (*.xlsm) and answer the "do you want to enable macros" question as "yes" or "OK" (depending on the button label for your version of Excel) the next time you open your workbook.
 
Upvote 0
I could not think of a good way to do what you asked for using formulas (mainly because of the indeterminate number of semi-colon delimited organisms that can be in a cell), so I wrote a UDF (user defined function for you)...
Code:
Function GetGrams(Organisms As String, LookUpTable As Range) As String
  Dim X As Long, Z As Long, Orgs() As String
  Static LookupData As Variant
  If IsEmpty(LookupData) Then LookupData = LookUpTable
  Orgs = Split(Organisms, ";")
  For X = 0 To UBound(Orgs)
    For Z = 1 To UBound(LookupData)
      If Split(Orgs(X))(0) = LookupData(Z, 1) Then
        GetGrams = GetGrams & ";" & LookupData(Z, 2)
        Exit For
      End If
    Next
  Next
  GetGrams = Mid(GetGrams, 2)
End Function

HOW TO INSTALL UDFs
------------------------------------
If you are new to UDFs, they are easy to install and use. To install it, simply press ALT+F11 to go into the VB editor and, once there, click Insert/Module on its menu bar, then copy/paste the above code into the code window that just opened up. That's it.... you are done. You can now use GetGrams just like it was a built-in Excel function. For example,

=GetGrams(B2,Sheet2!A1:B5)

where the first argument (B2) is cell reference to the Organism cell you want to get the Gram symbol(s) for and the second argument is a range reference to the cells making up your "Gram+/-" table.

If you are using XL2007 or above, make sure you save your file as an "Excel Macro-Enabled Workbook (*.xlsm) and answer the "do you want to enable macros" question as "yes" or "OK" (depending on the button label for your version of Excel) the next time you open your workbook.

Thanks. I'll have to take a look at this more closely when I get a minute.
I've done these before, but will take a look at the code you presented.
Really appreciate it!
 
Upvote 0
I could not think of a good way to do what you asked for using formulas (mainly because of the indeterminate number of semi-colon delimited organisms that can be in a cell), so I wrote a UDF (user defined function for you)...
Code:
Function GetGrams(Organisms As String, LookUpTable As Range) As String
  Dim X As Long, Z As Long, Orgs() As String
  Static LookupData As Variant
  If IsEmpty(LookupData) Then LookupData = LookUpTable
  Orgs = Split(Organisms, ";")
  For X = 0 To UBound(Orgs)
    For Z = 1 To UBound(LookupData)
      If Split(Orgs(X))(0) = LookupData(Z, 1) Then
        GetGrams = GetGrams & ";" & LookupData(Z, 2)
        Exit For
      End If
    Next
  Next
  GetGrams = Mid(GetGrams, 2)
End Function

HOW TO INSTALL UDFs
------------------------------------
If you are new to UDFs, they are easy to install and use. To install it, simply press ALT+F11 to go into the VB editor and, once there, click Insert/Module on its menu bar, then copy/paste the above code into the code window that just opened up. That's it.... you are done. You can now use GetGrams just like it was a built-in Excel function. For example,

=GetGrams(B2,Sheet2!A1:B5)

where the first argument (B2) is cell reference to the Organism cell you want to get the Gram symbol(s) for and the second argument is a range reference to the cells making up your "Gram+/-" table.

If you are using XL2007 or above, make sure you save your file as an "Excel Macro-Enabled Workbook (*.xlsm) and answer the "do you want to enable macros" question as "yes" or "OK" (depending on the button label for your version of Excel) the next time you open your workbook.

This works awesome! Really appreciate it!

I noticed that if the organism column is pulled in with a line feed from the Oracle database
Code:
LISTAGG(r.ORGANISM, ';'||chr(10))
that it didn't work for those cells with more then one organism. So I looked at the code and realized all I had to do is modify the one line to;
Code:
Orgs = Split(Organisms, ";" & Chr(10))

Thanks so much for this!
 
Upvote 0
I noticed that if the organism column is pulled in with a line feed from the Oracle database
Code:
LISTAGG(r.ORGANISM, ';'||chr(10))
that it didn't work for those cells with more then one organism. So I looked at the code and realized all I had to do is modify the one line to;
Code:
Orgs = Split(Organisms, ";" & Chr(10))
But wouldn't that screw up the results for the cell text where there was no Line Feed? If you do the change this way, it should work whether there is a Line Feed in the text or not (vbLf is a built-in VB constant for the Line Feed)...

Orgs = Split(Replace(Organisms, vbLf, ""), ";")
 
Upvote 0
I don't think it would really matter because the results coming from the query either have a line feed or don't, there wouldn't be a mixture of the two (the sql query aggregates the data and sets the delimiter). It's all or nothing the way that the SQL query works. However, that's for the info on the built-in VB constant. I've actually removed the line feed anyway now.

One thing I noticed is that it doesn't seem like the data updates until I close/reopen the workbook. For instance if I go into the lookup table and change a value from "+" to "-" it doesn't update where the GetGram formula is until I close/reopen the workbook. Its not a big deal but just curious why.
 
Upvote 0
I don't think it would really matter because the results coming from the query either have a line feed or don't, there wouldn't be a mixture of the two (the sql query aggregates the data and sets the delimiter). It's all or nothing the way that the SQL query works. However, that's for the info on the built-in VB constant. I've actually removed the line feed anyway now.
Just to reiterate, the change I offered in Message #6 will let the code work whether there is a Line Feed character in the text or not, so that function would work for either type of text value.


One thing I noticed is that it doesn't seem like the data updates until I close/reopen the workbook. For instance if I go into the lookup table and change a value from "+" to "-" it doesn't update where the GetGram formula is until I close/reopen the workbook. Its not a big deal but just curious why.
That is because I made the LookupData variable Static, which I did to make the UDF more efficient, but that makes its value "stick" for as long as the workbook is open. The reason I did that is, normally, a lookup table doesn't change after its been opened for use. Here is the code with the Static (and its complementary coding) removed so that you can update the table whenever you want (I have also incorporated the change I suggested in Message #6 as well)...
Code:
Function GetGrams(Organisms As String, LookUpTable As Range) As String
  Dim X As Long, Z As Long, Orgs() As String, LookupData As Variant
  LookupData = LookUpTable
  Orgs = Split(Replace(Organisms, vbLf, ""), ";")
  For X = 0 To UBound(Orgs)
    For Z = 1 To UBound(LookupData)
      If Split(Orgs(X))(0) = LookupData(Z, 1) Then
        GetGrams = GetGrams & ";" & LookupData(Z, 2)
        Exit For
      End If
    Next
  Next
  GetGrams = Mid(GetGrams, 2)
End Function
 
Upvote 0
Thanks for the explanation, much appreciated. I've learned a lot from this.
I will probably end up leaving it static because as you said it shouldn't change once entered into the lookup table. I initially thought something might have been wrong because I was entered value into the lookup table and they weren't showing up. Before I posted I reopened the workbook and they were there.
Once again, thanks!
 
Upvote 0

Forum statistics

Threads
1,221,525
Messages
6,160,327
Members
451,637
Latest member
hvp2262

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top