Combine Based on Common Column
June 14, 2018 - by Bill Jelen
David from Florida asks today's question:
I have two workbooks. Both have the same data in column A, but the remaining columns are different. How can I merge those two workbooks?
I asked David whether one workbook might have more records than the other, and the answer is yes. I asked whether the key field appears only once in each file, and the answer is also yes. Today, I will solve this with Power Query. The Power Query tools are found in Windows versions of Excel 2016+ in the Get & Transform section of the Data tab. If you have a Windows version of Excel 2010 or Excel 2013, you can download the Power Query add-in for those versions.
Here is David's workbook 1. It has Product and then three columns of data.
Here is David's workbook 2. It has Product Code and then other columns. In this example, there are extra products in workbook 2, but the solution will work if either workbook has extra rows.
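The two questions I asked David (is the key unique in each file, and can either file have keys the other lacks) can also be checked outside Excel. Here is a minimal Python sketch of those two checks. The sample rows, column values, and function names are made up for illustration and are not David's data.

```python
from collections import Counter


def duplicate_keys(rows, key):
    """Return the key values that appear more than once in rows."""
    counts = Counter(row[key] for row in rows)
    return sorted(k for k, n in counts.items() if n > 1)


def key_differences(left, left_key, right, right_key):
    """Return (only_in_left, only_in_right) as sorted lists of key values."""
    left_keys = {row[left_key] for row in left}
    right_keys = {row[right_key] for row in right}
    return sorted(left_keys - right_keys), sorted(right_keys - left_keys)


# Hypothetical sample data standing in for the two workbooks.
workbook1 = [
    {"Product": "Apple", "Class": 1},
    {"Product": "Banana", "Class": 2},
]
workbook2 = [
    {"Product Code": "Apple", "Price": 0.5},
    {"Product Code": "Banana", "Price": 0.25},
    {"Product Code": "Cherry", "Price": 3.0},
]

print(duplicate_keys(workbook1, "Product"))  # []
print(key_differences(workbook1, "Product", workbook2, "Product Code"))
# ([], ['Cherry'])
```

An empty duplicate list for both files and a non-empty difference in either direction is the situation David described: unique keys, with one file holding extra products.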
Here are the steps:
1. Select Data, Get Data, From File, From Workbook.
2. Browse to the first workbook and click OK.
3. In the Navigator dialog, choose the worksheet on the left. (Even if there is only one worksheet, you have to select it.) You will see the data on the right.
4. In the Navigator dialog, open the Load dropdown and choose Load To...
5. Choose Only Create a Connection and press OK.
Repeat steps 1-5 for the second workbook.
If you've done both workbooks, you should see two connections on the Queries & Connections Panel on the right of your Excel screen.
Continue with the steps to merge the workbooks:
1. Select Data, Get Data, Combine Queries, Merge.
2. From the top drop-down in the Merge dialog, choose the first query.
3. From the second drop-down in the Merge dialog, choose the second query.
4. Click the Product heading in the top preview. This is the key field. (You can select two or more key fields by Ctrl+clicking.)
5. Click the Product Code heading in the second preview.
6. Open the Join Type drop-down and choose Full Outer (All Rows From Both).
7. Click OK. The data preview does not show the extra rows yet and only shows "Table" repeatedly in the last column.
8. Notice there is an "Expand" icon in the heading for DavidTwo. Click that icon.
9. Optional, but I always clear "Use Original Column Name As Prefix". Click OK.
The results are shown in this preview:
10. In Power Query, use Home, Close & Load.
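The merge steps above amount to a full outer join on the key column, followed by expanding the second table's columns. The Python sketch below shows that logic with made-up rows. It is an illustration, not what Power Query runs internally. The function name, the sample data, and the `prefix` parameter (standing in for the "Use Original Column Name As Prefix" checkbox) are assumptions, and the code assumes each key is unique per table and that non-key column names do not collide.

```python
def full_outer_join(left, left_key, right, right_key, prefix=""):
    """Full outer join of two lists of dicts.

    Assumes each key value appears at most once per table and that
    column names differ between the tables (or a prefix is given).
    """
    left_cols = list(left[0].keys())
    right_cols = list(right[0].keys())
    right_by_key = {row[right_key]: row for row in right}

    result = []
    matched = set()
    # Every row from the left table, with right-hand values where the key matches.
    for lrow in left:
        out = dict(lrow)
        rrow = right_by_key.get(lrow[left_key])
        if rrow is not None:
            matched.add(lrow[left_key])
        for col in right_cols:
            out[prefix + col] = rrow[col] if rrow is not None else None
        result.append(out)
    # Rows that exist only in the right table, with blanks for the left columns.
    for rrow in right:
        if rrow[right_key] not in matched:
            out = {col: None for col in left_cols}
            for col in right_cols:
                out[prefix + col] = rrow[col]
            result.append(out)
    return result


# Hypothetical sample data: each table has one product the other lacks.
workbook1 = [
    {"Product": "Apple", "Class": 1},
    {"Product": "Banana", "Class": 2},
]
workbook2 = [
    {"Product Code": "Apple", "Price": 0.5},
    {"Product Code": "Cherry", "Price": 3.0},
]

merged = full_outer_join(workbook1, "Product", workbook2, "Product Code")
for row in merged:
    print(row)
# {'Product': 'Apple', 'Class': 1, 'Product Code': 'Apple', 'Price': 0.5}
# {'Product': 'Banana', 'Class': 2, 'Product Code': None, 'Price': None}
# {'Product': None, 'Class': None, 'Product Code': 'Cherry', 'Price': 3.0}
```

Calling `full_outer_join(workbook1, "Product", workbook2, "Product Code", prefix="DavidTwo.")` names the expanded columns `DavidTwo.Product Code` and `DavidTwo.Price`, which mirrors leaving the prefix checkbox selected.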
Here is the beautiful feature: if the underlying data in either workbook changes, you can click the Refresh icon to pull the new data into the results workbook.
The icon for Refresh is usually hidden. Drag the left edge of the Queries & Connections pane to the left to reveal the icon.
Learn Excel from MrExcel Podcast, Episode 2216: Combine Two Workbooks Based on a Common Column.
Hey, welcome back to MrExcel netcast, I'm Bill Jelen. Today's question's from David, who was in my seminar in Melbourne, Florida, for the Space Coast Chapter of the IIA.
David has two different workbooks where Column A is in common between both of them. So, here's Workbook 1, here's Workbook 2-- both have product code. This one has items that the first one doesn't have, or vice versa, and David wants to combine all the columns. So, we have three columns here and four columns here. I put both of these in the same workbook, in case you're downloading the workbook to work along. Take each one of these, move it out to its own workbook and save it.
Alright, to combine these files, we're going to use Power Query. Power Query's built into Excel 2016. If you're in a Windows version of Excel 2010 or 2013, you can go out to Microsoft and download Power Query. You can start from a new blank workbook with a blank worksheet. You're going to save this file-- Save as, you know, maybe Workbook, to show the results of combined files .xlsx. Alright? And what we're going to do is, we're going to do two queries. We're going to go to Data, Get Data, From File, From Workbook, and then we'll choose the first file. In a preview, select the sheet that has your data, and we don't have to do anything to this data. So just open the load box and choose Load To, Only Create Connection, click OK. Perfect. Now, we're going to repeat that for the second item-- Data, From File, From a Workbook, choose DavidTwo, choose the sheet name, and then open the load, Load To, Only Create a Connection. You'll see over here in this panel, we have both connections present. Alright.
Now the actual work-- Data, Get Data, Combine Queries, Merge, and then in the Merge dialog, choose DavidOne, DavidTwo, and this next step is completely unintuitive. You have to do this. Choose the column or columns in common-- so Product and Product. Alright. And then, be very careful here with the join type. I want all rows from both because one might have an extra row and I need to see that, and then we click OK. Alright. And here's the initial result. It doesn't look like it worked; it doesn't look like it added the extra items that were in file 2. And we have this column 5-- it's null now. I'm going to right click column 5 and say, Remove that column. So open this expand icon and uncheck this box for Use original column name as prefix, and BAM! it works. So the extra items that were in File 2, that aren't in File 1, do appear.
Alright. Now in today's file, it looks like this Product Code column is better than this Product column, because it has extra rows. But there might be a day in the future where Workbook 1 has things that Workbook 2 doesn't have. So I'm going to leave both of them there, and I'm not going to get rid of any nulls because, like, even though this row at the bottom appears to be completely null, there might be in the future a situation where we have a few nulls in here because something's missing. Alright? So, finally, Close & Load, and we have our sixteen rows.
Now, in the future, let's say that something changes. Alright, so we'll go back to one of those two files and I'll change the class for Apple to 99, and let's even insert something new and save this workbook. Alright. And then, if we want our merge file to update, come over here-- now, watch out, when you do this the first time, you can't see the Refresh icon-- you have to grab this bar and drag it over. And we will do Refresh, and 17 rows loaded, the watermelon appears, the Apple changes to 99-- it's a beautiful thing. Now, hey, do you wanna learn about Power Query? Buy this book by Ken Puls and Miguel Escobar, M is for (DATA) MONKEY. It'll get you up to speed.
Wrap-up today: David from Florida has two workbooks that he wants to combine; they both have the same fields in Column A, but the other columns are all different; one workbook might have extra items that are not in the other and David wants those; there's no duplicates in either file; we're going to use Power Query to solve this, so start in a new blank workbook on a blank worksheet; you're going to do three queries, first one-- Data, From File, Workbook, and then Load To, Only Create Connection; the same thing for the second workbook, and then Data, Get Data, Combine Queries, Merge, select the two connections, select the column that's common in both-- in my case, Product-- and then from the Join Type, you want a full outer join: all rows from File 1 and all rows from File 2. And then the beautiful thing is if the underlying data changes, you can just refresh the query.
To download the workbook from today's video, visit the URL in the YouTube description.
Well, hey, I want to thank David for showing up for my seminar, I want to thank you for stopping by. I'll see you next time for another netcast from MrExcel.
Download Excel File
To download the Excel file: combine-based-on-common-column.xlsx
Power Query is an amazing tool in Excel.
Excel Thought Of the Day
I've asked my Excel Master friends for their advice about Excel. Today's thought to ponder:
"Always press F4 when you read range or matrix in a function"
Title Photo: Eloise Ambursley on Unsplash