Transpose data as an array in vba

seattletimebandit

Board Regular
Joined
Apr 11, 2013
Messages
69
Hello,

I have a need to take raw laboratory data presented in columns and transpose as an array into a pivot table-like format. I have 600 rows of data that has a lot of repeating
values (Sample ID, Date, Chemical Name). So I need to loop through each "set" of samples and transpose the Sample ID's and Dates as columns and keep the Chemical Names as rows,
but not repeating the Chemical Names over and over (there only 10 chemicals), the Results will then populate the data field transposed into the appropriate corresponding cells.

I have something that works, but it has the chemical names hard-coded and I would like to be able to do this by transposing as a array, as chemical names can change (some samples may be analyzed for 4 chemicals only, while others might be analyzed for 64 chemicals). To be able to ask the user to select a range to transpose would be icing on the cake.

Thanks in advance!

My raw data looks like this:

Sample IDSample DateChemical NameResult
TP-01-4.503-Oct-13Benzene0.0228 U
TP-01-4.503-Oct-13Ethylbenzene0.0341 U
TP-01-4.503-Oct-13Toluene0.0228 U
TP-01-4.503-Oct-13mp-Xylene0.0228 U
TP-01-4.503-Oct-13o-Xylene0.0228 U
TP-01-4.503-Oct-13cis-12-Dichloroethene0.0228 U
TP-01-4.503-Oct-13trans-12-Dichloroethene0.0228 U
TP-01-4.503-Oct-13Tetrachloroethene (PCE)0.121
TP-01-4.503-Oct-13Trichloroethene (TCE)0.0228 U
TP-01-4.503-Oct-13Vinyl chloride0.00228 U
TP-01-8.504-Oct-13Benzene0.0228 U
TP-01-8.504-Oct-13Ethylbenzene0.0343 U
TP-01-8.504-Oct-13Toluene0.0228 U
TP-01-8.504-Oct-13mp-Xylene0.0228 U
TP-01-8.504-Oct-13o-Xylene0.0228 U
TP-01-8.504-Oct-13cis-12-Dichloroethene0.0228 U
TP-01-8.504-Oct-13trans-12-Dichloroethene0.0228 U
TP-01-8.504-Oct-13Tetrachloroethene (PCE)0.32
TP-01-8.504-Oct-13Trichloroethene (TCE)0.0228 U
TP-01-8.504-Oct-13Vinyl chloride0.00228 U
TP-02-4.505-Oct-13Benzene0.0291 U
TP-02-4.505-Oct-13Ethylbenzene0.0437 U
TP-02-4.505-Oct-13Toluene0.0291 U
TP-02-4.505-Oct-13mp-Xylene0.0291 U
TP-02-4.505-Oct-13o-Xylene0.0291 U
TP-02-4.505-Oct-13cis-12-Dichloroethene0.0291 U
TP-02-4.505-Oct-13trans-12-Dichloroethene0.0291 U
TP-02-4.505-Oct-13Tetrachloroethene (PCE)1.18
TP-02-4.505-Oct-13Trichloroethene (TCE)0.0291 U
TP-02-4.505-Oct-13Vinyl chloride0.00291 U

<tbody>
</tbody>

I need it to look like this:

Chemical NameTP-01-4.5TP-01-8.5TP-02-4.5
03-Oct-1304-Oct-1305-Oct-13
Benzene0.0228 U0.0228 U0.0291 U
Ethylbenzene0.0341 U0.0343 U0.0437 U
Toluene0.0228 U0.0228 U0.0291 U
mp-Xylene0.0228 U0.0228 U0.0291 U
o-Xylene0.0228 U0.0228 U0.0291 U
cis-12-Dichloroethene0.0228 U0.0228 U0.0291 U
trans-12-Dichloroethene0.1210.0228 U0.0291 U
Tetrachloroethene (PCE)0.0228 U0.321.18
Trichloroethene (TCE)0.00228 U0.0228 U0.0291 U
Vinyl chloride0.0228 U0.00228 U0.00291 U

<tbody>
</tbody>

<tbody>
</tbody>
 
Russell,

The SQL approach should work regardless of the number of rows of data in your dataset. You should just need to define MyData to include the entire dataset.

Perhaps there's a problem with blanks or the data type of some of cells. Can you identify the row at which that that the SQL approach stops working and see if there's something different about that row?
 
Upvote 0

Excel Facts

Get help while writing formula
Click the italics "fx" icon to the left of the formula bar to open the Functions Arguments dialog. Help is displayed for each argument.
This is a better code, but may not solve the "Error" problem.
As the code writes to the sheet Directly, then if you relate the place it stops in the results to your actual data, you should be able to see the approx line/s it fails on.
If you then try the code on just those lines and it still fails, then perhaps you could send those lines to give me something to work on.
Code:
[COLOR="Navy"]Sub[/COLOR] MG02Nov29
[COLOR="Navy"]Dim[/COLOR] dn      [COLOR="Navy"]As[/COLOR] Range
[COLOR="Navy"]Dim[/COLOR] Rng     [COLOR="Navy"]As[/COLOR] Range
[COLOR="Navy"]Dim[/COLOR] Dic     [COLOR="Navy"]As[/COLOR] Object
[COLOR="Navy"]Dim[/COLOR] col     [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long[/COLOR]
[COLOR="Navy"]Set[/COLOR] Rng = Range(Range("C2"), Range("c" & Rows.Count).End(xlUp))
 [COLOR="Navy"]With[/COLOR] CreateObject("Scripting.Dictionary")
 .CompareMode = 1
 [COLOR="Navy"]For[/COLOR] [COLOR="Navy"]Each[/COLOR] dn [COLOR="Navy"]In[/COLOR] Rng.Offset(, -2)
    [COLOR="Navy"]If[/COLOR] Not .exists(dn.Value & "," & dn.Offset(, 1)) [COLOR="Navy"]Then[/COLOR]
        col = col + 1
        .Add (dn.Value & "," & dn.Offset(, 1)), col
    [COLOR="Navy"]End[/COLOR] If
 [COLOR="Navy"]Next[/COLOR] dn
 
 [COLOR="Navy"]Set[/COLOR] Dic = CreateObject("Scripting.Dictionary")
    Dic.CompareMode = 1
   [COLOR="Navy"]For[/COLOR] [COLOR="Navy"]Each[/COLOR] dn [COLOR="Navy"]In[/COLOR] Rng
            [COLOR="Navy"]If[/COLOR] Not Dic.exists(dn.Value) [COLOR="Navy"]Then[/COLOR]
                [COLOR="Navy"]Set[/COLOR] Dic(dn.Value) = CreateObject("Scripting.Dictionary")
                
            [COLOR="Navy"]End[/COLOR] If
        
        [COLOR="Navy"]If[/COLOR] Not Dic(dn.Value).exists(dn.Offset(, -2).Value & dn.Offset(, -1).Value) [COLOR="Navy"]Then[/COLOR]
            Dic(dn.Value).Add (dn.Offset(, -2).Value & "," & dn.Offset(, -1).Value), dn.Offset(, 1)
        [COLOR="Navy"]End[/COLOR] If
    [COLOR="Navy"]Next[/COLOR] dn
   [COLOR="Navy"]Dim[/COLOR] k [COLOR="Navy"]As[/COLOR] Variant
   [COLOR="Navy"]Dim[/COLOR] p
   [COLOR="Navy"]Dim[/COLOR] c [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long[/COLOR]
 
    c = 2
    Cells(1, "F") = "Chemical Name"
    [COLOR="Navy"]For[/COLOR] [COLOR="Navy"]Each[/COLOR] k [COLOR="Navy"]In[/COLOR] Dic.Keys
        c = c + 1
        Cells(c, "F") = k
            [COLOR="Navy"]For[/COLOR] [COLOR="Navy"]Each[/COLOR] p [COLOR="Navy"]In[/COLOR] Dic(k)
               Cells(1, .Item(p) + 6) = Split(p, ",")(1)
               Cells(2, .Item(p) + 6) = Split(p, ",")(0)
               Cells(c, .Item(p) + 6) = Dic(k).Item(p)
            [COLOR="Navy"]Next[/COLOR] p
   [COLOR="Navy"]Next[/COLOR] k
[COLOR="Navy"]End[/COLOR] [COLOR="Navy"]With[/COLOR]
[COLOR="Navy"]End[/COLOR] [COLOR="Navy"]Sub[/COLOR]
Regards Mick
 
Upvote 0
Hey Mick,

It still busts on the same line of code in the second If Then statement in the second For loop. I stepped through line by line and it looped fine through the first For loop, went through the second For loop for quite a few iterations, then busted. I don't have any blank cells in the table and the data in each field is the same format. The one big difference is that I have Sample ID's that have 3 extra compounds farther down the table that include "Gasoline" "Diesel (Fuel Oil)", and "Heavy Oil". Could that cause a problem?

Another thing I forgot to mention was that the final table need to have the Sample ID in a cell above the Sample Date (see my table in my first posting). Jerry's sql query concatenated the two into one cell. I can't tell if your code is doing the same thing (I can usually kind of figure out what is going in VBA code that someone else writes, but yours is pretty cryptic to me). I realize it's an associative array (I looked it up), but I have little experience with arrays.

Is it possible to attach my full table to a posting here? If so, I don't see where to do that.

Russell
 
Upvote 0
Russell, Thanks for uploading the example data. You have some duplicate entries in the data. If there are duplicates for all 4 columns, those can be handled.

If you have any cases in which the first 3 columns are duplicated and the 4th column is different, (2 different results for the same test), you'd have to decide how that should be handled since both the SQL and Dictionary approaches are currently only expecting one unique result. In your example data, those 3-column duplicates are on Rows 191-192, and 261-262.

Based on your response to that I can post an SQL approach that will provide the headers on two rows like you want, but an SQL approach isn't a good fit if you are going to have tables with more than 255 columns. Your example data generates 96 columns which would work, but you noted earlier that you might have 3,000 rows of raw data which might generate more than 255 columns. What's the maximum number of Columns (Unique Test Code & Dates) you would need?
 
Last edited:
Upvote 0
Jerry,

Looks like I missed those duplicate rows during my query export from Access to Excel. They are caused by the laboratory running a duplicate analysis on the same sample to check the QA/QC of the instrumentation. I always delete the dup data record when running the query in Access, but sometimes one or two get through. Those are usually found when we are building tables for presentation in a report or for a chart in Excel. So, in short, duplicate results should not be a part of the final table.

While the size of final table can be rather large (>255 columns) on rare occasions, the lab data actually comes in smaller chunks. In the case of this table, I appended 14 separate imported data files into one table in Access, created the query for select fields and certain parameters, then exported the queried data table to Excel, which is what we are using here as the example table. Of the 14 data tables from the lab, the largest table has 168 rows (not sure how many columns, but way less than 255). Generally larger tables might be created from legacy data (years of sampling and analyses), which we do create on rare occasions, but mostly we are working with data that is coming in small sizes on a daily basis.

If there were any tables with >255 columns, I could parse those down to a manageable size and piece them together for the final table presentation.

(Just to through a wrench into this, sometimes, when like to switch the compounds to the column headers and the sample ID's/dates to the row headers!)

Did I answer that satisfactorily? It's Sunday and the extra hour of sleep has me feeling tired...

Russell
 
Upvote 0
.
It's Sunday and the extra hour of sleep has me feeling tired...

:LOL: I usually feel that way in the Spring when the clocks go forward.

The wrench of orienting the compounds as the column headers seems like a good approach. For most reporting and analysis activities (printing, filtering, scrolling....) having Rows > Columns works better.

BTW, since Mick is offline, I'll offer that I believe this tweak is all that's needed to get his Dictionary code to work...

Code:
If Not Dic(dn.Value).exists(dn.Offset(, -2).Value[B][COLOR="#006400"] & "," &[/COLOR][/B] dn.Offset(, -1).Value) Then
 
Upvote 0
Jerry,

Thanks for the fix to Mick's code. It worked! I figured it had to be some missing syntax that I simply could not see with my limited VBA experience, and as I stated above, Mick used some pretty cryptic (to me) code that I couldn't figure out what exactly was going on line-by-line. The run-time error I was getting was:

Run-time error '457':
This key is already associated with an element of this collection.


The only thing I see with my quick review of the new table is that the compound name is below the date. I'll see if I can tweak it to flip those two around. If I run into trouble, I'll send up a flare**.

I'm still interested in your Excel SQL query though (can it still work with this table as it is currently formatted (sans the duplicated rows)? My co-worker and I were excited to see the database query tool. More for us to play with and learn from!

**Just tried tweaking for the order of the headers (I reversed the cell references) and it worked:

Mick had:

Code:
[COLOR=#574123]Cells([/COLOR][B][COLOR=#008080]1[/COLOR][/B][COLOR=#574123], .Item(p) + 6) = Split(p, ",")([/COLOR][COLOR=#008080][B]1[/B][/COLOR][COLOR=#574123])
[/COLOR][COLOR=#574123]Cells([/COLOR][B][COLOR=#008080]2[/COLOR][/B][COLOR=#574123], .Item(p) + 6) = Split(p, ",")([/COLOR][B][COLOR=#008080]0[/COLOR][/B][COLOR=#574123])
[/COLOR]

I changed it to:

Code:
Cells([B][COLOR=#ff0000]1[/COLOR][/B], .Item(p) + 6) = Split(p, ",")([B][COLOR=#ff0000]0[/COLOR][/B])
Cells([B][COLOR=#ff0000]2[/COLOR][/B], .Item(p) + 6) = Split(p, ",")([COLOR=#ff0000][B]1[/B][/COLOR])
 
Upvote 0
Well sleuthed!

Here's some VBA code you can try that uses SQL through ADO.

To use this, you'll need to add a reference in the VBAProject to the ADO library.
To do that, from the VB Editor menubar > Tools > References... > check "Microsoft ActiveX Data Objects 2.8 Library" > OK.


Code:
Sub TransformArray()
    Dim cnn As ADODB.Connection
    Dim rst As ADODB.Recordset
    Dim rResults As Range
    Dim sSql As String, sDataAddress As String
    Dim i As Long, lR As Long, lC As Long
    Dim vArray, vResults, vHeader
    
    With Sheets("Sheet1")
        sDataAddress = .Name & "$A1:D3" & _
            .Cells(.Rows.Count, "A").End(xlUp).Row
        
        '---destination of results of query
        Set rResults = .Range("F1")
    End With

    sSql = Join$(Array( _
        "TRANSFORM MAX(Result)", _
        "SELECT [Chemical Name]", _
        "FROM [" & sDataAddress & "]", _
        "GROUP BY [Chemical Name]", _
        "PIVOT [Sample ID] & '|' & [Sample Date]" _
        ), vbCr)
    
    Set cnn = New ADODB.Connection
    Set rst = New ADODB.Recordset
    
    With cnn
      .Provider = "Microsoft.Jet.OLEDB.4.0"
      .Properties("Extended Properties").Value = "Excel 8.0;IMEX=1"
      .Open ActiveWorkbook.FullName
    End With
         
    rst.CursorLocation = adUseServer
    rst.Open Source:=sSql, _
        ActiveConnection:=cnn, _
        CursorType:=adOpenStatic, _
        LockType:=adLockReadOnly, _
        Options:=adCmdText

    ReDim vResults(1 To rst.RecordCount + 2, 1 To rst.Fields.Count)
    
    vArray = rst.GetRows(rst.RecordCount)
    
    '--transfer headers to results array
    For lC = 0 To rst.Fields.Count - 1
        vHeader = Split(rst.Fields(lC).Name, "|")
        For i = 0 To UBound(vHeader)
            vResults(i + 1, lC + 1) = Split(rst.Fields(lC).Name, "|")(i)
        Next i
    Next lC
        
    '--transfer records to results array
    For lR = 0 To rst.RecordCount - 1
        For lC = 0 To rst.Fields.Count - 1
            vResults(lR + 3, lC + 1) = vArray(lC, lR)
        Next lC
    Next lR
    
    rResults.Resize(UBound(vResults, 1), _
        UBound(vResults, 2)) = vResults
    
    rst.Close
    cnn.Close
    Set rst = Nothing
    Set cnn = Nothing

End Sub

The code could be modified to place the compounds as the column headers if needed.
 
Upvote 0

Forum statistics

Threads
1,215,331
Messages
6,124,311
Members
449,152
Latest member
PressEscape

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top