Method to transpose vertically stacked, variable sized, subtotaled data from an HTML format to an Excel range suitable for Pivot Tables

curtishavak

New Member
Joined
May 31, 2012
Messages
41
Firstly, apologies for the novel, but I don't know any other way to explain it...

I need to import a (poorly-formatted) HTML report generated from Oracle into Excel. I do this on a regular basis and have developed a technique and even a few basic recorded (I have zero VBA knowledge) macros. However, one of the key things that I run into is the need to use autofill for non consistent number of cells.

I start with a copy paste of the HTML text into Excel, cell A1. I get something that looks like this:
A / B / C / D
1) Total for Posting Year/Month: 2010/04 $4,000
2) Total for Posting Year/Month: 2010/06 $5,000
3) Total for Posting Year/Month: 2010/09 $6,000
4) Total for Customer: ABC, Inc. $15,000
5) Total for Market Segment: Aerospace $93,134
6) Total for Customer Class: Industrial $405,231

All text/values are in column A (A1, A2, A3, etc.) and this is a very small snippet of a much larger report (thousands of lines like this).

Basically this is a descending summation: All totals for each posting are summed from lines of orders that shipped in that month/year and these shipments are all for the customer that comes below them in the list (ABC, Inc.) and ABC, Inc. and all such customers with posting months (non-consecutive, so note that if shipments were only in 4, 6 and 9, all other months are left out of the subtotals completely instead of giving zero values as place-holders) are summed into their Market Segment (Aerospace) and likewise all Market segments are then grouped by Customer Class (Industrial) which corresponds to a sales division.

My goal is to reduce the list to the most basic subtotal (posting year/month) with categories Customer name, Market Segment and Customer Class being listed to the right in their own columns so that I can use a Pivot Table. After a lot of playing with text-to-columns, filtering, indexing the rows and then flipping the whole list upside down so I can use Auto Fill, I get a list that looks like this:

A / B / C / D
Index/ "Totals for" / Category / Shipments
1) 6 / Total for Customer Class: / Industrial /$405,231
2) 5 / Total for Market Segment: / Aerospace / $93,134
3) 4 / Total for Customer: / ABC, Inc. / $15,000
4) 3 / Total for Posting Year/Month: / 2010/09 / $6,000
5) 2 / Total for Posting Year/Month: / 2010/06 / $5,000
6) 1 / Total for Posting Year/Month: / 2010/04 / $4,000


I then use filter on row B to, say, customer class. In this example, this would show one row, but imagine, again, that this contains thousands of values. I then then the following formula into E1:

= C1

And then use auto fill to copy the formula down into the filtered cells in Row E.

I then filter column B by the next "Total for", namely Market Segment and in Row F, use the same formula and use autofill to copy it down.

I do the same then for Customer Name in Row G.

Finally, I unfilter column B, highlight rows E, F and G, copy and paste special values to eliminate the formulas. This gives me something like this (note, I had to shorten some things to fit all this in this post and I couldn't fit in column G. I also added more rows to help illustrate what comes next):

A / B / C / D / E / F
Ind/ "Totals for" / Category / Shipments / Cust. Class / Mkt. Seg
1)12 / T...Customer Class: / Medical /$405,231 / Medical /
2)11 / T...Market Segment:/ Neuro / $93,134 / /Neuro
3)10 / T...Customer: / EFG, Inc. / $15,000 / /
4)9 / T...Year/Month: / 2010/08 / $6,000 / /
5)8 / T...Year/Month: / 2010/09 / $5,000 / /
6)7 / T...Year/Month: / 2010/012 / $4,000 / /
7)6 / T...Customer Class: / Industrial /$405,231 / Industrial /
8)5 / T...Market Segment:/ Aerospace / $93,134 / /Aerospace
9)4 / T...Customer: / ABC, Inc. / $15,000 / /
10)3 / T...Year/Month: / 2010/09 / $6,000 / /
11)2 / T...Year/Month: / 2010/06 / $5,000 / /
12)1 / T...Year/Month: / 2010/04 / $4,000 / /

Now, for shorter reports, I have just been doing the highlighting cell E1, double-clicking the bottom right corner to auto fill down the value until it stops at the next cell with a value already in it, hit CTRL+Down Arrow, then repeat. I then to the same for columns F, G, etc. This yields the following:


A / B / C / D / E / F
Ind/ "Totals for" / Category / Shipments / Cust. Class / Mkt. Seg
1)12 / T...Customer Class: / Medical /$405,231 / Medical /
2)11 / T...Market Segment:/ Neuro / $93,134 / Medical /Neuro
3)10 / T...Customer: / EFG, Inc. / $15,000 / Medical /Neuro
4)9 / T...Year/Month: / 2010/08 / $6,000 / Medical /Neuro
5)8 / T...Year/Month: / 2010/09 / $5,000 / Medical /Neuro
6)7 / T...Year/Month: / 2010/012 / $4,000 / Medical /Neuro
7)6 / T...Customer Class: / Industrial /$405,231 / Industrial /Neuro
8)5 / T...Market Segment:/ Aerospace / $93,134 / Industrial /Aerospace
9)4 / T...Customer: / ABC, Inc. / $15,000 / Industrial /Aerospace
10)3 / T...Year/Month: / 2010/09 / $6,000 / Industrial /Aerospace
11)2 / T...Year/Month: / 2010/06 / $5,000 / Industrial /Aerospace
12)1 / T...Year/Month: / 2010/04 / $4,000 / Industrial /Aerospace

note that Row 7 then contains an inaccuracy (Industrial and Neuro together), but this doesn't matter because this row and those like it are removed by the final step wherein I filter column B to show only "Total for Posting Year/Month", which are then the only unique values containing all of the categories across the rows. I highlight the whole sheet while filtered, and copy into a new sheet to eliminate all the duplicates, yielding this:

A / B / C / D / E / F /G
Ind/ "Totals for" / Category / Shipments/ Cust. Class / Mkt. Seg /Cust.
1)9 / T...Year/Month:/ 2010/08 / $6,000 / Medical /Neuro /EFG Co
2)8 / T...Year/Month:/ 2010/09 / $5,000 / Medical /Neuro /EFG Co
3)7 / T...Year/Month:/ 2010/12 / $4,000 / Medical /Neuro /EFG Co
4)3 / T...Year/Month:/ 2010/09 / $6,000 / Industrial /Aerospace/ABC Co
5)2 / T...Year/Month:/ 2010/06 / $5,000 / Industrial /Aerospace /ABC Co
6)1 / T...Year/Month:/ 2010/04 / $4,000 / Industrial /Aerospace/ABC Co

Afterwards, I split the column C into separate year and month columns, but finally have the individual lines that I need, giving me something a pivot table can digest so I can make YTD comparisons, track customer, divisional and market segment growth, etc.

NOW (assuming you're still reading)... my question is this:

For really long reports, double clicking each cell, hitting control down, REPEAT...is an excruciatingly slow, monotonous task. All the other steps are relatively easy and can even be macro'ed.

Is there a way to automate this step?

I don't know enough about VBA to macro it myself, and when I try recording a Macro, my problem is non-uniform number of rows to fill. For example if one customer ships in Jan, Feb, May and Dec (4 months) and another in March, October and December (3 months), the auto fill doesn't work because when using absolute reference, it records the cell references, not that I want it to copy and paste into all blank cells between the current cell and the next cell with a value in it. Using relative reference yields the same problem only that it pastes in the number of cells I tell it to, so I can't build in variation. Plus each report I run is going to be a different length, so I can't tell it to loop X number of times (even if it did work); I need it to loop until there are simply no more values beneath it.

Alternatively, if there is no way to automate the step, does anyone have any ideas on an alternative way to transpose date organized in such a manner???

I have a 10,000+ line :eek: report that my Sr. VP has asked for and would prefer not to give myself carpal tunnel with all the double-clicks this would otherwise entail.

GO!
 

Excel Facts

VLOOKUP to Left?
Use =VLOOKUP(A2,CHOOSE({1,2},$Z$1:$Z$99,$Y$1:$Y$99),2,False) to lookup Y values to left of Z values.
Can you supply a sample workbook? You can use a service like box.com or google docs.

In it, show the raw data sheet. This would represent the imported HTML file.

Create each addition sheet that you would like created from the raw data.

From what I can gleen, you want a Year/Month sheet, a Customer Sheet, and a Market Segment Sheet?
 
Upvote 0
Can you supply a sample workbook? You can use a service like box.com or google docs.

In it, show the raw data sheet. This would represent the imported HTML file.

Create each addition sheet that you would like created from the raw data.

From what I can gleen, you want a Year/Month sheet, a Customer Sheet, and a Market Segment Sheet?


Finally got back to this one. See link the following link to my example file:

https://docs.google.com/spreadsheet/ccc?key=0Avz0joRLczLddE1SRDQ4Ri1WeUZYQXdBbDdKcXYyd1E


There are 3 tabs: Where the Problem Starts; How I fix it manually; End product, ready for pivot.

Per my initial novel of a post, I currently use a combination of double-click to autofill and CTRL+Down Arrow to get the to next Customer Name or Shipment Date and then repeat (and then filter down to one unique value, copy and paste into a new sheet), but this takes forever with longer reports. Unfortunately, I don't know how to write VBA code, just record macros, and I can't find a way to make Excel understand that I want to fill all the blank cells beneath a cell with that cell's value.

Please help!
 
Upvote 0
Here is code for the 'autofill' problem. Duplicate the code and change the range for column H.

Code:
Sub FillUpColG()
    Dim WS As Worksheet
    Dim lRow As Long
    Dim aCell As Range

    Set WS = Sheet1
    With WS
        lRow = .Cells(.Rows.Count, "F").End(xlUp).Row

        For Each aCell In Range("G3:G" & lRow)

            If aCell.Offset(1, 0).Value = Empty Then
                aCell.Offset(1, 0).Value = aCell.Value
            End If

        Next
    End With
End Sub

Auto filter can handle the rest, or did you want something more?
 
Upvote 0
David - thanks for responding. The code you gave me works, but only for the first two blank cells. I thought possibly it was something to do with the fact that there were two blank rows beneath "Customer A", but when I eliminate the extra row, it still only fills in those two values.

Any thoughts?

Example below:


Before:
COLUMN G
Customer Name
CustomerName
CUSTOMER A
CUSTOMER A
(Blank)
CUSTOMER B
(Blank)
CUSTOMER C
(Blank)
CUSTOMER D
(Blank)
CUSTOMER E
(Blank)

After:
COLUMN G
Customer Name
(Blank)
CUSTOMER A
(Blank)
(Blank)
CUSTOMER B
(Blank)
CUSTOMER C
(Blank)
CUSTOMER D
(Blank)
CUSTOMER E
(Blank)

<tbody>
</tbody>

<tbody>
</tbody><colgroup><col></colgroup>
 
Upvote 0
Just noticed that the before/after is reversed in my previous comment. Please help! So close, but still so far away!
 
Upvote 0

Forum statistics

Threads
1,214,653
Messages
6,120,751
Members
448,989
Latest member
mariah3

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top