Trying to Vlookup Duplicates

Pisky

New Member
Joined
Jun 14, 2022
Messages
12
Office Version
  1. 2013
Platform
  1. Windows
I have a dataset roughly 700,000 rows - So a large data set.

Basically this data set has a column for article number, description and price - This is for clothing products

Each article number is 10 digits long with the first 7 being unique to the type of product and the last 3 determining the size.

For example 6112167001 would be size small 6112167002 would be medium and so on.

I want to be able to be able to type in one of the article numbers, say 6112167001 and it pull each variation to the "6112167" number with each size - 001,002,003 and then vlookup the description from that.

I have tried adding a unique identifier but with 700,000 rows it doesn't load, always crashes and even if I manage to load 100,000 rows of it, it takes forever to even save.

Is there another way to find this?

I have added one of the products in a spreadsheet to help.

Product Example.xlsx
ABCDEFGHI
1Raw DataPulled data
2Article NumberDescriptionStruggling to turn this into this.
36112167001TARN BSW513 INDIGO STRIPE LS Indigo SArticle Number 6112167001=6112167001TARN BSW513 INDIGO STRIPE LS Indigo S
46112167002TARN BSW513 INDIGO STRIPE LS Indigo M6112167002TARN BSW513 INDIGO STRIPE LS Indigo M
56112167003TARN BSW513 INDIGO STRIPE LS Indigo L6112167003TARN BSW513 INDIGO STRIPE LS Indigo L
66112167004TARN BSW513 INDIGO STRIPE LS Indigo XL6112167004TARN BSW513 INDIGO STRIPE LS Indigo XL
76112167005TARN BSW513 INDIGO STRIPE L Indigo 2XL6112167005TARN BSW513 INDIGO STRIPE L Indigo 2XL
Sheet1
Cell Formulas
RangeFormula
I3:I7I3=VLOOKUP(H3,A:B,2,FALSE)
 

Excel Facts

How to find 2nd largest value in a column?
MAX finds the largest value. =LARGE(A:A,2) will find the second largest. =SMALL(A:A,3) will find the third smallest
Welcome to the MrExcel board!

With such large data and Excel 2013 I suspect you might have performance issues with any formula approach.
If I have understood correctly then you can get the article numbers with a formula like this.

22 06 14.xlsm
ABEFGH
2Article NumberDescription
36112167001TARN BSW513 INDIGO STRIPE LS Indigo SArticle Number 6112167001=6112167001
46112167002TARN BSW513 INDIGO STRIPE LS Indigo M6112167002
56112167003TARN BSW513 INDIGO STRIPE LS Indigo L6112167003
66112167004TARN BSW513 INDIGO STRIPE LS Indigo XL6112167004
76112167005TARN BSW513 INDIGO STRIPE L Indigo 2XL6112167005
8 
9 
Pisky
Cell Formulas
RangeFormula
H3:H9H3=IFERROR(INDEX(A:A,AGGREGATE(15,6,ROW(A$3:A$800000)/(LEFT(A$3:A$800000,7)=LEFT(F$3,7)),ROWS(H$3:H3))),"")


If performance is too painful then would you consider a macro approach?
 
Upvote 0
Due to the size of your data, and what you are trying to accomplish, this would really work much better in a relational database program like Microsoft Access, or SQL (what you are describing is actually a "relational database"). This could be handled without too much problem using queries in Access (or SQL).

Note, if you are stuck using Excel, you may want to investigate using "Power Query", which allows you to do database-type operations in Excel.
 
Upvote 0
A few things, I have 365 at home - Where I am currently. My work unfortunately uses 2013 (No clue why but whatever)

Also they block 3rd party installations like Power Query - Already tried that *Sigh*

I have access to Microsoft Access (2013 again I believe) but I'm not familiar with it nor know how to use it.

The main reason I need to be able to pull this data as in my job I'm required to check clothing lines that have had one product stolen in that line and we are unable to determine which size hence we check each product in that size and determine it that way - Stock accuracy we love to call it lol. So this will allow me to punch in a bunch of codes and have that generate a list we need to check.

That formula mention works, but the same issues I had earlier with needing to check more than one clothing item it turns into the attached. I've been trying to wrap my head around a work around. I know I could always make it a table and filter blanks but I'm too lazy to do that everytime I need to check clothing lines. Plus not every clothing like has 5 different sizes. Some have up to 18 or so different sizes.

The workbook currently also allows me to use it to generate data and receipts for our crime database so this was just an add in feature I wanted to include.

Cell Formulas
RangeFormula
B1:B5B1=IFERROR(INDEX('Master Data'!A:A,AGGREGATE(15,6,ROW('Master Data'!A$3:A$800000)/(LEFT('Master Data'!A$3:A$800000,7)=LEFT(A$1,7)),ROWS(C$1:C1))),"")
C1:C10C1=VLOOKUP(B1,'Master Data'!A:E,2,FALSE)
B6:B10B6=IFERROR(INDEX('Master Data'!A:A,AGGREGATE(15,6,ROW('Master Data'!A$1:A$800000)/(LEFT('Master Data'!A$1:A$800000,7)=LEFT(A$6,7)),ROWS(C$6:C6))),"")
 
Upvote 0
Welcome to the MrExcel board!

With such large data and Excel 2013 I suspect you might have performance issues with any formula approach.
If I have understood correctly then you can get the article numbers with a formula like this.

22 06 14.xlsm
ABEFGH
2Article NumberDescription
36112167001TARN BSW513 INDIGO STRIPE LS Indigo SArticle Number 6112167001=6112167001
46112167002TARN BSW513 INDIGO STRIPE LS Indigo M6112167002
56112167003TARN BSW513 INDIGO STRIPE LS Indigo L6112167003
66112167004TARN BSW513 INDIGO STRIPE LS Indigo XL6112167004
76112167005TARN BSW513 INDIGO STRIPE L Indigo 2XL6112167005
8 
9 
Pisky
Cell Formulas
RangeFormula
H3:H9H3=IFERROR(INDEX(A:A,AGGREGATE(15,6,ROW(A$3:A$800000)/(LEFT(A$3:A$800000,7)=LEFT(F$3,7)),ROWS(H$3:H3))),"")


If performance is too painful then would you consider a macro approach?
If there was a Macro Approach you would recommend... Shoot. I'll give it a shot and see if it works. I'm unfamiliar with Macro and VBA codes etc.
 
Upvote 0
See if this could be any use. To implement (test with a copy of your workbook) ..
1. With your workbook active press Alt+F11 to bring up the vba window.
2. In the Visual Basic window use the menu to Insert|Module
3. Copy and Paste the code below into the main right hand pane that opens at step 2.
4. Close the Visual Basic window.
5. Press Alt+F8 to bring up the Macro dialog
6. Select the macro & click ‘Run’
7. Your workbook will need to be saved as a macro-enabled workbook (*.xlsm)

VBA Code:
Sub GetArticles()
  Dim d As Object
  Dim a As Variant, b As Variant
  Dim i As Long, k As Long
  
  Set d = CreateObject("Scripting.Dictionary")
  With Sheets("Master Data")
    a = .Range("A2", .Range("B" & Rows.Count).End(xlUp)).Value
  End With
  With Sheets("Sheet1")
    b = .Range("A2", .Range("A" & Rows.Count).End(xlUp)).Value
    For i = 1 To UBound(b)
      d(Left(b(i, 1), 7)) = 1
    Next i
    ReDim b(1 To UBound(a), 1 To 2)
    For i = 1 To UBound(a)
      If d.exists(Left(a(i, 1), 7)) Then
        k = k + 1
        b(k, 1) = a(i, 1)
        b(k, 2) = a(i, 2)
      End If
    Next i
    Application.ScreenUpdating = False
    With .Range("B2:C2").Resize(k)
      .Value = b
      .Sort Key1:=.Columns(1), Order1:=xlAscending, Header:=xlNo
      .Rows(0).Value = Array("Article Number", "Description")
      .EntireColumn.AutoFit
    End With
    Application.ScreenUpdating = True
  End With
End Sub

This was my sample 'Master Data'

Pisky.xlsm
AB
1Article NumberDescription
26112167005TARN BSW513 INDIGO STRIPE L Indigo 2XL
36112166004TARN MYS604 BLK HOME OF THE J Black XL
46112166002TARN MYS604 BLK HOME OF THE JA Black M
56112167004TARN BSW513 INDIGO STRIPE LS Indigo XL
66112166001TARN MYS604 BLK HOME OF THE JA Black S
76112166003TARN MYS604 BLK HOME OF THE JA Black L
86112167003TARN BSW513 INDIGO STRIPE LS Indigo L
96112167002TARN BSW513 INDIGO STRIPE LS Indigo M
106666666666Data 1
116112167001TARN BSW513 INDIGO STRIPE LS Indigo S
125555555555Data 7
136112166005TARN MYS604 BLK HOME OF THE Black 2XL
Master Data


My 'Sheet1' just had what you see in column A before the code was run and the code produced what you see in columns B:C

Pisky.xlsm
ABC
1ArticleArticle NumberDescription
261121660045555555555Data 7
355555555556112166001TARN MYS604 BLK HOME OF THE JA Black S
461121670016112166002TARN MYS604 BLK HOME OF THE JA Black M
588888888886112166003TARN MYS604 BLK HOME OF THE JA Black L
66112166004TARN MYS604 BLK HOME OF THE J Black XL
76112166005TARN MYS604 BLK HOME OF THE Black 2XL
86112167001TARN BSW513 INDIGO STRIPE LS Indigo S
96112167002TARN BSW513 INDIGO STRIPE LS Indigo M
106112167003TARN BSW513 INDIGO STRIPE LS Indigo L
116112167004TARN BSW513 INDIGO STRIPE LS Indigo XL
126112167005TARN BSW513 INDIGO STRIPE L Indigo 2XL
13
Sheet1
 
Upvote 0
Solution
See if this could be any use. To implement (test with a copy of your workbook) ..
1. With your workbook active press Alt+F11 to bring up the vba window.
2. In the Visual Basic window use the menu to Insert|Module
3. Copy and Paste the code below into the main right hand pane that opens at step 2.
4. Close the Visual Basic window.
5. Press Alt+F8 to bring up the Macro dialog
6. Select the macro & click ‘Run’
7. Your workbook will need to be saved as a macro-enabled workbook (*.xlsm)

VBA Code:
Sub GetArticles()
  Dim d As Object
  Dim a As Variant, b As Variant
  Dim i As Long, k As Long
 
  Set d = CreateObject("Scripting.Dictionary")
  With Sheets("Master Data")
    a = .Range("A2", .Range("B" & Rows.Count).End(xlUp)).Value
  End With
  With Sheets("Sheet1")
    b = .Range("A2", .Range("A" & Rows.Count).End(xlUp)).Value
    For i = 1 To UBound(b)
      d(Left(b(i, 1), 7)) = 1
    Next i
    ReDim b(1 To UBound(a), 1 To 2)
    For i = 1 To UBound(a)
      If d.exists(Left(a(i, 1), 7)) Then
        k = k + 1
        b(k, 1) = a(i, 1)
        b(k, 2) = a(i, 2)
      End If
    Next i
    Application.ScreenUpdating = False
    With .Range("B2:C2").Resize(k)
      .Value = b
      .Sort Key1:=.Columns(1), Order1:=xlAscending, Header:=xlNo
      .Rows(0).Value = Array("Article Number", "Description")
      .EntireColumn.AutoFit
    End With
    Application.ScreenUpdating = True
  End With
End Sub

This was my sample 'Master Data'

Pisky.xlsm
AB
1Article NumberDescription
26112167005TARN BSW513 INDIGO STRIPE L Indigo 2XL
36112166004TARN MYS604 BLK HOME OF THE J Black XL
46112166002TARN MYS604 BLK HOME OF THE JA Black M
56112167004TARN BSW513 INDIGO STRIPE LS Indigo XL
66112166001TARN MYS604 BLK HOME OF THE JA Black S
76112166003TARN MYS604 BLK HOME OF THE JA Black L
86112167003TARN BSW513 INDIGO STRIPE LS Indigo L
96112167002TARN BSW513 INDIGO STRIPE LS Indigo M
106666666666Data 1
116112167001TARN BSW513 INDIGO STRIPE LS Indigo S
125555555555Data 7
136112166005TARN MYS604 BLK HOME OF THE Black 2XL
Master Data


My 'Sheet1' just had what you see in column A before the code was run and the code produced what you see in columns B:C

Pisky.xlsm
ABC
1ArticleArticle NumberDescription
261121660045555555555Data 7
355555555556112166001TARN MYS604 BLK HOME OF THE JA Black S
461121670016112166002TARN MYS604 BLK HOME OF THE JA Black M
588888888886112166003TARN MYS604 BLK HOME OF THE JA Black L
66112166004TARN MYS604 BLK HOME OF THE J Black XL
76112166005TARN MYS604 BLK HOME OF THE Black 2XL
86112167001TARN BSW513 INDIGO STRIPE LS Indigo S
96112167002TARN BSW513 INDIGO STRIPE LS Indigo M
106112167003TARN BSW513 INDIGO STRIPE LS Indigo L
116112167004TARN BSW513 INDIGO STRIPE LS Indigo XL
126112167005TARN BSW513 INDIGO STRIPE L Indigo 2XL
13
Sheet1
You sir... Are a legend. Thank you! That works flawlessly!
 
Upvote 0
You're welcome. Thanks for the confirmation. :)

How did it go for speed with your huge data? I must admit I only tested up to about 70,000 rows not 700,000 and my data may not be representative of yours.
 
Upvote 0
You're welcome. Thanks for the confirmation. :)

How did it go for speed with your huge data? I must admit I only tested up to about 70,000 rows not 700,000 and my data may not be representative of yours.
It worked fine for the 700,000 rows! Didn’t take too long to load. I tested it with 6 separate article numbers and it loaded them all flawlessly. Appreciate your work man! Smarter guy than I am! 😂
 
Upvote 0

Forum statistics

Threads
1,215,011
Messages
6,122,677
Members
449,092
Latest member
tayo4dgacorbanget

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top