Merging multiple sheets - highlight discrepancies

StevenRB

New Member
Joined
Jul 1, 2014
Messages
1
Hello Everyone,

From what I thought would be a straight forward function, this has had me scratching my head far too often this week. Until now, I've been manually sorting and copy/pasting in data, but this enhances the possibilities for errors. I've been trying to use a vlookup and index/match solutions, but with the size of the dataset, it becomes unstable. So I'm looking for a VBA solution. Or if Excel already has a feature like this inplace, that would be good too.

I Apologise for the length of the post, I tried to make it as informative as possible :)

The issue I'm having is that I want to merge several sheets of data into one master file, but each file may have redundant data, duplicated data, incorrect data and so on. Below is an example of the issue

I have created a Master_file and include information of the subjects. ‘Previous IDs’ have been converted into their current/‘running Id’s’, in some cases more than one ‘running ID’ has been made from a ‘Previous ID’. A ‘Previous ID’ will only have one genotype for each genotype group, however due to mistakes in generating data, ‘running Ids’ from the same ’Previous ID’ may show different results (One thing I want to detect).
I then want to merge all of the results into the one Master file, while highlighting the erroneous data:

Sheet1

*ABCDEFG
1Master_file:******
2Running_IDPrevious_IDGenderGenotype1Genotype2Genotype3Genotype4
351100M****
452100M****
553101F****
654102M****
755103F****
856103F****
957103F****

<colgroup><col style="font-weight:bold; width:30px; "><col style="width:146px;"><col style="width:84px;"><col style="width:77px;"><col style="width:75px;"><col style="width:75px;"><col style="width:75px;"><col style="width:75px;"></colgroup><tbody>
</tbody>


Excel tables to the web >> Excel Jeanie HTML 4


Sheet1

*ABCDEFG
11Datasheet1:******
12Running_ID **Genotype1Genotype2Genotype3Genotype4
1351**10**
1453**01**
1557**11**

<colgroup><col style="font-weight:bold; width:30px; "><col style="width:146px;"><col style="width:84px;"><col style="width:77px;"><col style="width:75px;"><col style="width:75px;"><col style="width:75px;"><col style="width:75px;"></colgroup><tbody>
</tbody>


Excel tables to the web >> Excel Jeanie HTML 4


Sheet1

*ABCD
17Datasheet2:***
18Running_ID Genotype2Genotype3Genotype4
1952110
2053101
2155111

<colgroup><col style="font-weight:bold; width:30px; "><col style="width:146px;"><col style="width:84px;"><col style="width:77px;"><col style="width:75px;"></colgroup><tbody>
</tbody>


Excel tables to the web >> Excel Jeanie HTML 4


Sheet1

*ABCDE
24Datasheet3:****
25Running_ID Previous_ID Genotype2Genotype3Genotype4
26*100110
27*102101
28*103-11

<colgroup><col style="font-weight:bold; width:30px; "><col style="width:146px;"><col style="width:84px;"><col style="width:77px;"><col style="width:75px;"><col style="width:75px;"></colgroup><tbody>
</tbody>


Excel tables to the web >> Excel Jeanie HTML 4

The order I feel it would have to happen in is that Datasheet3 will import first and lay down the background (because the Previous Id’s will cover several running_IDs).

Then each datasheet set will be imported to its respective running ID (They will not always be in the same order), and the data for the genotype (Will not always be in the same order, but will always have the same name) will be compared follow these rules:

If the importing genotype is the same = don’t import,
if the importing genotype is not existing yet = import,
if the importing genotype shows “-“ or “N/A” or etc=import,
if the importing genotypes are different = flag/Mark/Highlight etc)

To give a result like the table below:

Sheet1

*ABCDEFG
32Updated_Master_File:******
33Running_IDPrevious_IDGender Genotype1Genotype2Genotype3Genotype4
3451100M11/010
3552100M0110
3653101F0101
3754102M*101
3855103F1111
3956103F*-11
4057103F*-11

<colgroup><col style="font-weight:bold; width:30px; "><col style="width:146px;"><col style="width:84px;"><col style="width:77px;"><col style="width:75px;"><col style="width:75px;"><col style="width:75px;"><col style="width:75px;"></colgroup><tbody>
</tbody>


Excel tables to the web >> Excel Jeanie HTML 4


The Database I have is expanding, with additional datasheets being added to it. At the moment I have about 4400 rows by 280 columns. I’m happy to re-run the analysis everytime for an 'Updated_Master_file'.

I would really appreciate if someone could help me with this, or could point me in the right direction.

Thank you
Steve.
 

Excel Facts

Best way to learn Power Query?
Read M is for (Data) Monkey book by Ken Puls and Miguel Escobar. It is the complete guide to Power Query.

Forum statistics

Threads
1,215,247
Messages
6,123,853
Members
449,129
Latest member
krishnamadison

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top