Lookup or match data... not sure how to title this

Outdoorsman80

New Member
Joined
Oct 4, 2014
Messages
45
Office Version
  1. 365
Platform
  1. Windows
I have a spreadsheet that has combined exported data from multiple sources.

Columns:
MRN
PATIENT_NAMEMD USER_IDMD User Name
1234Doe, JaneUI123Fake, Doc
1283Smith, JakeUI172Real, Doc
1246Fake, NameUI123Fake, Doc MD
2983Fake, GirlUI822Smith, Doc
9836Fake, GuyUI098Doe, Doc

<tbody>
</tbody>

I need to be able to identify, highlight the discrepancy in the MD User Name column. As you can see, there are 2 listings for Fake, Doc- they are both the same people as indicated by the MD USER_ID but one row has the MD in it while the other doesn't.

How can I do this? I have a massive amount of rows and about 40% of the data has these types of discrepancies for many different doctors.

MRN belongs to the patient name.
 

Excel Facts

VLOOKUP to Left?
Use =VLOOKUP(A2,CHOOSE({1,2},$Z$1:$Z$99,$Y$1:$Y$99),2,False) to lookup Y values to left of Z values.

jtakw

Well-known Member
Joined
Jun 29, 2014
Messages
6,007
Office Version
  1. 2016
Platform
  1. Windows
Hi,

Not clear on How you want them identified, here are 2 ways, use Column E formula to indicate duplicated entries, use Column G formula as your CF formula to highlight the cell:

<b></b><table cellpadding="2.5px" rules="all" style=";background-color: rgb(255,255,255);border: 1px solid;border-collapse: collapse; border-color: rgb(187,187,187)"><colgroup><col width="25px" style="background-color: rgb(218,231,245)" /><col /><col /><col /><col /><col /><col /><col /></colgroup><thead><tr style=" background-color: rgb(218,231,245);text-align: center;color: rgb(22,17,32)"><th></th><th>A</th><th>B</th><th>C</th><th>D</th><th>E</th><th>F</th><th>G</th></tr></thead><tbody><tr ><td style="color: rgb(22,17,32);text-align: center;">1</td><td style=";">MRN</td><td style=";">PATIENT_NAME</td><td style=";">MD USER_ID</td><td style=";">MD User Name</td><td style=";">Alert</td><td style="text-align: right;;"></td><td style=";">CF formula for Column D</td></tr><tr ><td style="color: rgb(22,17,32);text-align: center;">2</td><td style="text-align: right;;">1234</td><td style=";">Doe, Jane</td><td style=";">UI123</td><td style="background-color: #FFFF00;;">Fake, Doc</td><td style=";">Duplicated</td><td style="text-align: right;;"></td><td style="text-align: right;;">TRUE</td></tr><tr ><td style="color: rgb(22,17,32);text-align: center;">3</td><td style="text-align: right;;">1283</td><td style=";">Smith, Jake</td><td style=";">UI172</td><td style=";">Real, Doc</td><td style=";"></td><td style="text-align: right;;"></td><td style="text-align: right;;">FALSE</td></tr><tr ><td style="color: rgb(22,17,32);text-align: center;">4</td><td style="text-align: right;;">1246</td><td style=";">Fake, Name</td><td style=";">UI123</td><td style="background-color: #FFFF00;;">Fake, Doc MD</td><td style=";">Duplicated</td><td style="text-align: right;;"></td><td style="text-align: right;;">TRUE</td></tr><tr ><td style="color: rgb(22,17,32);text-align: center;">5</td><td style="text-align: right;;">2983</td><td style=";">Fake, Girl</td><td style=";">UI822</td><td style=";">Smith, Doc</td><td style=";"></td><td style="text-align: right;;"></td><td style="text-align: right;;">FALSE</td></tr><tr ><td style="color: rgb(22,17,32);text-align: center;">6</td><td style="text-align: right;;">9836</td><td style=";">Fake, Guy</td><td style=";">UI098</td><td style=";">Doe, Doc</td><td style=";"></td><td style="text-align: right;;"></td><td style="text-align: right;;">FALSE</td></tr></tbody></table><p style="width:5.6em;font-weight:bold;margin:0;padding:0.2em 0.6em 0.2em 0.5em;border: 1px solid rgb(187,187,187);border-top:none;text-align: center;background-color: rgb(218,231,245);color: rgb(22,17,32)">Sheet13</p><br /><br /><table width="85%" cellpadding="2.5px" rules="all" style=";border: 2px solid black;border-collapse:collapse;padding: 0.4em;background-color: rgb(255,255,255)" ><tr><td style="padding:6px" ><b>Worksheet Formulas</b><table cellpadding="2.5px" width="100%" rules="all" style="border: 1px solid;text-align:center;background-color: rgb(255,255,255);border-collapse: collapse; border-color: rgb(187,187,187)"><thead><tr style=" background-color: rgb(218,231,245);color: rgb(22,17,32)"><th width="10px">Cell</th><th style="text-align:left;padding-left:5px;">Formula</th></tr></thead><tbody><tr><th width="10px" style=" background-color: rgb(218,231,245);color: rgb(22,17,32)">E2</th><td style="text-align:left">=IF(<font color="Blue">COUNTIF(<font color="Red">C$2:C$6,C2</font>)>1,"Duplicated",""</font>)</td></tr><tr><th width="10px" style=" background-color: rgb(218,231,245);color: rgb(22,17,32)">G2</th><td style="text-align:left">=COUNTIF(<font color="Blue">C$2:C$6,C2</font>)>1</td></tr></tbody></table></td></tr></table><br />
 

lrobbo314

Well-known Member
Joined
Jul 14, 2008
Messages
3,344
Office Version
  1. 365
Platform
  1. Windows
If all you want to do is look at the differences, I would create a pivot table. Add the MD User_ID to rows, and then add the MD User_Name to Rows.

Having said that the code below can do 2 things. It can highlight the differences. That is to say, the "MD" part of "Fake, Doc MD" would be highlighted as red text. The code finds the User_Name with the fewest number of characters and highlights the difference in all of the cells with longer lengths. Not sure if the shortest length User_Name is the one you would necessarily want, but that's what it does.

But, I figured that you would rather just change the User_Name to be equal to the shortest User_Name. So, if you run the code the way it is, it will do this.

It will look at each User_ID. Find the shortest User_Name associated to that ID. Then, change all other User_Names associated to that ID to the shortest User_Name. So, for UI123, all MD_UserNames would be 'Fake, Doc'.

This code assumes that your data is in columns A:D, starting in A1. If your data is somewhere else, you will need to adjust the code.

You will also need to add a reference to 'Microsoft Scripting Runtime' for the code to work.

Code:
Sub MedII()
Dim AR()
Dim R As Range
Dim LR As Long
Dim Dict As New Dictionary
Dim Current As String
Dim i As Integer
Dim Lowest As Integer


LR = Range("A" & Rows.Count).End(xlUp).Row()
Set R = Range("C2:D" & LR)
AR = R.Value


For i = 1 To UBound(AR)
    Current = AR(i, 1)
    If Not Dict.Exists(Current) Then
        Dict.Add Current, Current
        Lowest = FindLowest(AR, i)
        HighlightDif R.Columns(1), CStr(AR(Lowest, 1)), CStr(AR(Lowest, 2))
    End If
Next i


End Sub


Function FindLowest(ByRef V As Variant, Pos As Integer) As Integer
Dim Low As Integer


Low = Pos


For i = Pos + 1 To UBound(V)
    If V(i, 1) = V(Pos, 1) And Len(V(i, 2)) < Len(V(Pos, 2)) Then
        Low = i
    End If
Next i


FindLowest = Low
End Function


Sub HighlightDif(R As Range, ID As String, uName As String)
Dim cel As Range
Dim Off As Range


For Each cel In Range(R.Address)
    If cel.Value = ID Then
    Set Off = cel.Offset(, 1)
        If Off.Value <> uName Then
'            For i = Len(uName) + 1 To Len(Off.Value) 'This commented block of code will change the font color of the differences to Red
'                Off.Characters(i, 1).Font.Color = vbRed
'            Next i
            Off.Value = uName
        End If
    End If
Next cel
End Sub
 
Last edited:

Outdoorsman80

New Member
Joined
Oct 4, 2014
Messages
45
Office Version
  1. 365
Platform
  1. Windows
Hi,

Not clear on How you want them identified, here are 2 ways, use Column E formula to indicate duplicated entries, use Column G formula as your CF formula to highlight the cell:

ABCDEFG
1MRNPATIENT_NAMEMD USER_IDMD User NameAlertCF formula for Column D
21234Doe, JaneUI123Fake, DocDuplicatedTRUE
31283Smith, JakeUI172Real, DocFALSE
41246Fake, NameUI123Fake, Doc MDDuplicatedTRUE
52983Fake, GirlUI822Smith, DocFALSE
69836Fake, GuyUI098Doe, DocFALSE

<tbody>
</tbody>
Sheet13

Worksheet Formulas
CellFormula
E2=IF(COUNTIF(C$2:C$6,C2)>1,"Duplicated","")
G2=COUNTIF(C$2:C$6,C2)>1

<tbody>
</tbody>

<tbody>
</tbody>


Hi, can you help me understand the formula a bit? If I have say 8000 rows, would I just change C$6 to C$8000 ?


lrobbo314 - I will try that in a bit
 
Last edited:

jtakw

Well-known Member
Joined
Jun 29, 2014
Messages
6,007
Office Version
  1. 2016
Platform
  1. Windows
Hi, can you help me understand the formula a bit? If I have say 8000 rows, would I just change C$6 to C$8000 ?lrobbo314 - I will try that in a bit
Yes, just change the cell range to match your data, you can even go over a bit in case the table may grow in size.
 
Solution

Forum statistics

Threads
1,148,294
Messages
5,745,942
Members
423,985
Latest member
sayed manzar

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Top