How to calculate matching values in two cells to see percentage of accuracy?

eric64418

New Member
Joined
Jan 19, 2022
Messages
8
Office Version
  1. 2021
Platform
  1. Windows
I have a list of CDs and LPs with UPCs that I found matching release pages for on the site Discogs.com - but I want to double check my work.

My Sheet has the UPC, Title, Release ID, and Description. I want to see the % of accuracy between Columns B & D are similar - knowing that some have some errant text such as (180g) or something - i would expect a rate of 80-90% means I am correct, but a percentage lesser than that needs investigation.

I have around 11k of these codes, so doing this manually would be quite a pain.

Any help would be MOST appreciated.

Thank you!!
Eric
 

Attachments

  • Screenshot 2022-01-19 230919.png
    Screenshot 2022-01-19 230919.png
    106.2 KB · Views: 74
Well hopefully we got there in the end, there were no unusual characters in the strings, but the pictures didn't show enough detail about the strings. Initially I tried searching for the last word in string D & using that as a cut off for the Mid function in B & whilst that worked for the vast majority some records had repetitions like “(Extended Mix)” appearing twice in B & D which caused the formula to cut off at the first occurrence. In the end the answer was a much simpler formula, it now looks for the same initial start position after the Artist but then extracts the length of string D from that point & checks for a match with D. In the examples provided there is only one “No Match”, I have added column F (Extraction) just so you can physically see if they match, but you can just ignore it in your worksheet it is just for demonstration purposes.


Discogs.xlsb
ABCDEF
1UPCFullDescriptionDiscogsReleaseIDDescription2MatchExtraction
254645665414Guidance (ltd. ed.) (picture disc)r15418244GuidanceMatchGuidance
354645635516Cutty Ranks - Grizzle / Cocoa Tea - Sonia (rhythm: "Punanny")r505655Grizzle / Cocoa Tea - SoniaMatchGrizzle / Cocoa Tea - Sonia
454645634113Cutty Ranks - Limb By Limb (Vocal); Cutty Ranks - Limb By Limb (Acapella) / Cutty Ranks - Limb By Limb (Hip Hop Mix) (rhythm: "Pitch")r1913711Limb By Limb (Vocal); Cutty Ranks - Limb By Limb (Acapella) / Cutty Ranks - Limb By Limb (Hip Hop Mix)MatchLimb By Limb (Vocal); Cutty Ranks - Limb By Limb (Acapella) / Cutty Ranks - Limb By Limb (Hip Hop Mix)
554645635912Dawn Penn - You Don't Love Me (No No No) / No No No Version (rhythm: "Dawn Penn - You Don't Love Me / AKA No No No")r2811476You Don't Love Me (No No No) / No No No VersionMatchYou Don't Love Me (No No No) / No No No Version
654645636513Devonte & Peter Metro - Say Wooee / Beenie Man - Bookshelf (rhythm: "Bookshelf")r1664807Say Wooee / Beenie Man - BookshelfMatchSay Wooee / Beenie Man - Bookshelf
754645637718Foxy Brown - Sorry; Version 1 / Version 2r902855Sorry; Version 1 / Version 2MatchSorry; Version 1 / Version 2
854645652414Gregory Isaacs - Mr. Know It All (Extended Mix) / Ossie Hibbert, Revolutionaries - War Of The Starsr5609275Mr. Know It All (Extended Mix) / Ossie Hibbert, Revolutionaries - War Of The StarsMatchMr. Know It All (Extended Mix) / Ossie Hibbert, Revolutionaries - War Of The Stars
954645637916Gregory Isaacs - Rumours / Version (rhythm: "Telephone Love AKA Rumors")r1584432Rumours / VersionMatchRumours / Version
1054645636216Ini Kamoze - Hot Stepper / Stepper Versionr656098Hot Stepper / Stepper VersionMatchHot Stepper / Stepper Version
1154645653510Itals - Ina Dis Ya Time / Yellowman - Operation Eradication; Harbour View Rock (Instrumental)r7003747Ina Dis Ya Time / Yellowman - Operation Eradication; Harbour View Rock (Instrumental)MatchIna Dis Ya Time / Yellowman - Operation Eradication; Harbour View Rock (Instrumental)
126.01811E+11Johnny Osbourne - Fally Ranking (Extended Mix) / Johnny Osbourne - Trechtown School (Extended Mix)r12108077Fally Ranking (Extended Mix) / Johnny Osbourne - Trechtown School (Extended Mix)MatchFally Ranking (Extended Mix) / Johnny Osbourne - Trechtown School (Extended Mix)
1354645637312Lady G - Nuff Respect / Version (rhythm: "Telephone Love")r1796465Nuff Respect / VersionMatchNuff Respect / Version
146.01811E+11Michael Prophet - Gunman (Extended Mix) / Michael Prophet - Cassandra (Extended Mix) (rhythm: "Michael Prophet - Gunman")r2334396Gunman (Extended Mix) / Michael Prophet - Cassandra (Extended Mix)MatchGunman (Extended Mix) / Michael Prophet - Cassandra (Extended Mix)
1554645634212Michigan & Smiley - Diseases (Extended Mix) / Nicodemus - Boneman Connection (Extended Mix) (rhythm: "Diseases AKA Golden Hen")r1585354Diseases (Extended Mix) / Nicodemus - Boneman Connection (Extended Mix)MatchDiseases (Extended Mix) / Nicodemus - Boneman Connection (Extended Mix)
166.30283E+11Reggae Regular - Where Is Jah (Extended Mix) / Reggae Regular - Black Star Liner (Extended Mix)r14652106Where Is Jah (Extended Mix) / Reggae Regular - Black Star Liner (Extended Mix)MatchWhere Is Jah (Extended Mix) / Reggae Regular - Black Star Liner (Extended Mix)
1754645636117Reggie Stepper - Kimbo King / Whining Skill (rhythm: "Stalag")r2509059Kimbo King / Whining SkillMatchKimbo King / Whining Skill
1854645633611Shabba Ranks - Wicked In Bed / Singing Melody - Groovy Kind Of Lover1788202Wicked In Bed / Singing Melody - Groovy Kind Of LoveMatchWicked In Bed / Singing Melody - Groovy Kind Of Love
1954645633512Sister Nancy - Bam Bam / Stalag Version (rhythm: "Stalag")r163058Bam Bam / Stalag VersionMatchBam Bam / Stalag Version
1215.29101E+1210000 Maniacs - Planned Obsolescence: Live '88 (remastered)r9882267Live At The Ritz NY 7th August '87No MatchPlanned Obsolescence: Live '88 (re
Sheet1
Cell Formulas
RangeFormula
E2:E19,E121E2=IF(D2=(IF(LEN(B2)>LEN(SUBSTITUTE(B2,"- ","")),TRIM(MID(B2,FIND("- ",B2)+1,LEN(D2)+1)),LEFT(B2,LEN(D2)))),"Match","No Match")
F2:F19,F121F2=IF(LEN(B2)>LEN(SUBSTITUTE(B2,"- ","")),TRIM(MID(B2,FIND("- ",B2)+1,LEN(D2)+1)),LEFT(B2,LEN(D2)))
 
Upvote 0

Excel Facts

How to calculate loan payments in Excel?
Use the PMT function: =PMT(5%/12,60,-25000) is for a $25,000 loan, 5% annual interest, 60 month loan.
Thank you so much.... you've saved me a TON of time manually checking all of these. Really appreciate your efforts!!
 
Upvote 0
Hi, I had a bit of trouble - i left another file in the google drive folder. Could you take a look?
 
Upvote 0
Hi, I have looked at the new sheet & whilst it is obvious that the columns relate to the same record (see below), there are no discernible patterns, the columns have different punctuation (line breaks etc..) & the basic configuration is too different to create a match, I will have a think & take a proper look sometime over the weekend.

Diseases (Extended Mix) / Nicodemus - Boneman Connection (Extended Mix) Michigan & Smiley

Diseases / Boneman Connection Michigan & Smiley /

Nicodemus
 
Upvote 0
Hi, I have looked at the new sheet & whilst it is obvious that the columns relate to the same record (see below), there are no discernible patterns, the columns have different punctuation (line breaks etc..) & the basic configuration is too different to create a match, I will have a think & take a proper look sometime over the weekend.

Diseases (Extended Mix) / Nicodemus - Boneman Connection (Extended Mix) Michigan & Smiley

Diseases / Boneman Connection Michigan & Smiley /

Nicodemus
Thank you so much for taking the time to think about this.... I can restructure either 'entry' - but still the things like (Extended Mix) will be in there... Yeah I'm not sure if this is possible in excel... definitely a tough one.
 
Upvote 0
You might want to give the Microsoft Excel Add-In Fuzzy Lookup a try.
You can it from Microsoft here.
Download Fuzzy Lookup Add-In for Excel from Official Microsoft Download Center

Power Query also has Fuzzy Lookup but the Add-In shows the percentage match which you are after while the PQ version does not.

Below are
1) the first 20 lines of you sample file with the % match up.
2) unmatched lines (2)
I set the minum match allowed to 25% and return only 1 result for each line (the match with the largest %)
Excluding the 2 unmatched ones the minimum match was given as 82%

1st 20 Lines of your Example file

20220122 Fuzzy Match sample 001.xlsx
EFGHI
1UPCFullDescriptionDiscogsReleaseIDDescription2Similarity
254645665414Black Ark Players, Lee Perry - Guidance (ltd. ed.) (picture disc)r15418244Guidance0.8200
354645635516Cutty Ranks - Grizzle / Cocoa Tea - Sonia (rhythm: "Punanny")r505655Grizzle / Cocoa Tea - Sonia0.9026
454645634113Cutty Ranks - Limb By Limb (Vocal); Cutty Ranks - Limb By Limb (Acapella) / Cutty Ranks - Limb By Limb (Hip Hop Mix) (rhythm: "Pitch")r1913711Limb By Limb (Vocal); Cutty Ranks - Limb By Limb (Acapella) / Cutty Ranks - Limb By Limb (Hip Hop Mix)0.9638
554645635912Dawn Penn - You Don't Love Me (No No No) / No No No Version (rhythm: "Dawn Penn - You Don't Love Me / AKA No No No")r2811476You Don't Love Me (No No No) / No No No Version0.8845
654645636513Devonte & Peter Metro - Say Wooee / Beenie Man - Bookshelf (rhythm: "Bookshelf")r1664807Say Wooee / Beenie Man - Bookshelf0.8958
754645637718Foxy Brown - Sorry; Version 1 / Version 2r902855Sorry; Version 1 / Version 20.9357
854645652414Gregory Isaacs - Mr. Know It All (Extended Mix) / Ossie Hibbert, Revolutionaries - War Of The Starsr5609275Mr. Know It All (Extended Mix) / Ossie Hibbert, Revolutionaries - War Of The Stars0.9672
954645637916Gregory Isaacs - Rumours / Version (rhythm: "Telephone Love AKA Rumors")r1584432Rumours / Version0.8400
1054645636216Ini Kamoze - Hot Stepper / Stepper Versionr656098Hot Stepper / Stepper Version0.9329
1154645653510Itals - Ina Dis Ya Time / Yellowman - Operation Eradication; Harbour View Rock (Instrumental)r7003747Ina Dis Ya Time / Yellowman - Operation Eradication; Harbour View Rock (Instrumental)0.9825
126.01811E+11Johnny Osbourne - Fally Ranking (Extended Mix) / Johnny Osbourne - Trechtown School (Extended Mix)r12108077Fally Ranking (Extended Mix) / Johnny Osbourne - Trechtown School (Extended Mix)0.9654
1354645637312Lady G - Nuff Respect / Version (rhythm: "Telephone Love")r1796465Nuff Respect / Version0.8743
146.01811E+11Michael Prophet - Gunman (Extended Mix) / Michael Prophet - Cassandra (Extended Mix) (rhythm: "Michael Prophet - Gunman")r2334396Gunman (Extended Mix) / Michael Prophet - Cassandra (Extended Mix)0.9063
1554645634212Michigan & Smiley - Diseases (Extended Mix) / Nicodemus - Boneman Connection (Extended Mix) (rhythm: "Diseases AKA Golden Hen")r1585354Diseases (Extended Mix) / Nicodemus - Boneman Connection (Extended Mix)0.8986
166.30283E+11Reggae Regular - Where Is Jah (Extended Mix) / Reggae Regular - Black Star Liner (Extended Mix)r14652106Where Is Jah (Extended Mix) / Reggae Regular - Black Star Liner (Extended Mix)0.9672
1754645636117Reggie Stepper - Kimbo King / Whining Skill (rhythm: "Stalag")r2509059Kimbo King / Whining Skill0.9000
1854645633611Shabba Ranks - Wicked In Bed / Singing Melody - Groovy Kind Of Lover1788202Wicked In Bed / Singing Melody - Groovy Kind Of Love0.9609
1954645633512Sister Nancy - Bam Bam / Stalag Version (rhythm: "Stalag")r163058Bam Bam / Stalag Version0.8947
2054645640916Sizzla - Just One Of Those Days; Acoustic Mix / Solid As A Rock (rhythm: "Queen Majesty")r818187Just One Of Those Days; Acoustic Mix / Solid As A Rock0.9375
217.25543E+11Studio X - Los Kings Del Mambo (Digifunky's House Mix) / Studio X - Los Kings Del Mambo (DJ EFX's Hyped Live Mix) (orig. press)r182011Los Kings Del Mambo (Digifunky's House Mix) / Studio X - Los Kings Del Mambo (DJ EFX's Hyped Live Mix)0.9630
Fuzzy Match


Unmatched lines

Book2
ABCDE
1UPCFullDescriptionDiscogsReleaseIDDescription2Similarity
25.29101E+1210000 Maniacs - Planned Obsolescence: Live '88 (remastered)r98822670.0000
35.02839E+12Admiral Bailey - Best Of (20 tracks)r114556900.0000
Sheet1


I gave the 3 column table the name tblComplete and the other single column tablethe name tblMatchUp.
It outputted to the active cell, so make sure that it is on a new sheet or to the right or below and data on the current sheet.

1642825625537.png
 
Upvote 0
In the beginning you asked for an accuracy percentage between the columns & that is what we ended up with. I have uploaded the file to the Google Drive but this is just an explanation of what happens in the code.

None of the code affects your original data

New Function – type in E14

= WordMatchPercentage (B14, D14)

Code Order

Str = B

Str1 = D

Remove all characters except for letters, numbers & spaces from both columns

Str = "Say Wooee Beenie Man Bookshelf Devonte Peter Metro"

Str1 = "Say, Wodee, Bookshelf, Devonte, Tanto, Metro, Beenie, Man"

Create array of words from Str1

Say

Wodee

Bookshelf

Devonte

Tanto

Metro

Beenie

Man


Iterate though the list to check if the words are present in Str

Say Match

Wodee No Match (Str spelling Wooee)

Bookshelf Match

Devonte Match

Tanto No Match (Str includes Peter & not Tanto)

Metro Match

Beenie Match

Man Match


Calculate Match Percentage - Six words out of the eight Match

Column E = 75%


With the data provided you have 11,916 records that can be matched (2441 null records), out of those 9,315 are a 100% match, 11,249 are a >= 75% match.

If you enter the formula & drag it down it takes about a minute to complete the 14K rows, after you have all the data run the macro ‘FixValues’, that will just leave the values in column E, this stops the worksheet recalculating the formulas when you filter your data. If you want to leave the formulas, I suggest you switch the worksheet to manual calculation when you filter the data to prevent recalculation. This formula works perfectly well with the original data provided; you will just need to copy the module into your original sheet.
 
Upvote 0

Forum statistics

Threads
1,214,827
Messages
6,121,818
Members
449,049
Latest member
cybersurfer5000

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top