Link to home
Start Free TrialLog in
Avatar of srauler58
srauler58Flag for United States of America

asked on

Transform Excel Horizontal Data to Vertical Data for Pivoting

I need a VBA script to transform a large Excel spreadsheet from a horizontal, easy-to-read format to a vertical, easy-to-pivot format. I found an example on Bill Jelen's MrExcel site (Google "Convert Horizontal Data to Vertical Data Using a Macro for Optimal Excel Pivot Table Analysis"), but it's for a much simpler dataset and I couldn't get it to scale.

The data export I'm working from is a listing of task labor estimates that can cover anywhere from a couple of months to 3 years or more. Each task can have one or more subtasks, each with it's own labor estimate. As can be seen in the sample dataset, the period of performance for a particular task may be a contiguous series of months or could be spread over many months with some months of inactivity located throughout.  Note that my actual dataset has 18 columns before the Total Hours column instead of 5 and about 1000 rows, but I trimmed it to keep the sample cleaner.

During peak operations, my team has to pull an export from the Oracle-based repository on a daily basis and report updated estimates ordered by Department. Ideally we can use a pivot table to produce reports and charts for each department and summary data for the entire organization.  Unfortunately, the Oracle system outputs the export file as an Excel file that's formatted for reading, not optimized for pivoting.  I need a clean columnar list of data for pivoting (which I would also use as input to an Access database for custom reports).

The SampleData1.xls file attached here shows the original ExportData on the first tab, the desired format on the Transformed tab, and a pivot table on the Pivot tab. Given the popularity of pivot tables and the way most Excel worksheets are formatted, I'm sure this problem has been solved before, but I couldn't find a solution in the archives here that wasn't Access specific. I sometimes receive Excel files from customers or vendors that are also set up for reading, so what I would like to build is a macro that allows my team to easily change a few key parameters and use it on the other datasets as well.

Any pointers would be greatly appreciated, or if you know of an existing solution you can point me to, that would be great as well.  I'm working on it now, but my coding skills don't tend to produce real elegant solutions.

Avatar of calacuccia
Flag of Belgium image

Link to home
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
have the ExportData sheet active and run this macro on it

Sub Traverse()

    Dim Export As Worksheet
    Set Export = ActiveSheet
    Dim Estimates As Range
    Range(Selection, Selection.End(xlToRight)).Select
    Range(Selection, Selection.End(xlDown)).Select
    Set Estimates = Selection
    Dim ResultSheet As Worksheet
    Set ResultSheet = Worksheets.Add()
    ResultSheet.Name = "OutputSheet"
    Range("F1").Value = "Year"
    Range("G1").Value = "Month"
    Range("H1").Value = "Hours"
    Dim i As Integer ' loop iterator
    Dim r As Integer ' record#
    Let r = 1
    Dim h As Integer ' horizontal index
    Dim v As Integer ' vertical index
    For v = 0 To Estimates.Rows.Count - 1
        For h = 0 To Estimates.Columns.Count - 1
            With Export
                ' do not include zero values
                If .Cells(Estimates.Row + v, Estimates.Column + h) > 0 Then
                    r = r + 1
                    ' duplicate the leftmost 5 columns
                    For i = 1 To 5
                        ResultSheet.Cells(r, i) = .Cells(Estimates.Row + v, i)
                    Next i
                    ' supply the year/month
                    ResultSheet.Cells(r, 6) = right(.Cells(1, 9 + h), 4)
                    ResultSheet.Cells(r, 7) = left(.Cells(1, 9 + h), 3)
                    ' migrate the data value
                    ResultSheet.Cells(r, 8) = .Cells(Estimates.Row + v, 9 + h)
                End If
            End With
        Next h
    Next v

End Sub

Open in new window

Avatar of srauler58


Calacuccia & bhwithun:
Thanks to both of you for your time and effort.  I ran both of the solutions against separate test data files and both approaches appeared sound.  Calacuccia's script came up with a totally correct solution on the abbreviated dataset and bhwithun's had everything correct except, for some reason, the headings of the last 3 columns ended up being "Total Hours", "Jan 2011" & "Feb 2011" instead of the values the code specified, while the ExportData sheet's columns had the new values.

When I run Calacuccia's script against my full dataset, I get a "Subscript out of range (Error 9)" message and the script halts when the "Set ExportS = wb.Worksheets("ExportData")" is executed. I got this same result against the abbreviated dataset when I changed the "Set wb = ActiveWorkbook" statement to "Set wb = ThisWorkbook" , but the original statement worked fine on the small set.  Against the full dataset, however, it halts every time. Could I be running up against a dataset size limit?  The full dataset has 67 columns (18 before the Total Hours column) and 727 total rows. That's 48,709 cells... would that overflow an array limitation of some sort? My full dataset will grow larger in the coming weeks to possibly more than 1000 rows.

I tried running the Traverse() script against the full dataset as well, but I must have messed up when changing the key parameters and, while I didn't get a "Subscript out of range" error, the resulting format was pretty messed up.  I'll have to study that one tomorrow when I'm not so tired.

Thanks again to both of you for your time!  Any light you can shed on the "Subscript out of range" error would be greatly appreciated.

- Steve R.
Hi Steve,

The Subscript out of Range has nothing to do with size limit, it is because the worksheet name is not correct.

You should change  "ExportData" to the name of the worksheet containing the full dataset.
Or to simplify you might use. The worksheet with the dataset must be selected for ActiveSheet to recongnize the right sheet.

Set ExportS = wb.ActiveSheet

Try this. This would create the Pivot table as well.

Note: Always create a backup copy of your file before try this code.

I assume you have 3 sheets in your workbook "ExportData","Transformed","Pivot"

Sub kTest()
    'Author :   Krishnakumar
    'Ref    :

    Dim wksSource   As Worksheet
    Dim wksDest     As Worksheet
    Dim wksPivot    As Worksheet
    Dim rngPivot    As Range
    Dim arrOutput()
    Dim k
    Dim i           As Long
    Dim c           As Long
    Dim n           As Long
    Dim j           As Long
    Dim Hdr
    Dim ptPivot     As PivotTable
    Dim ptCache     As PivotCache
    Const PivotTableName    As String = "MyPivotTable"
    Set wksSource = Sheets("ExportData")
    Set wksDest = Sheets("Transformed")
    Set wksPivot = Sheets("Pivot")
    k = wksSource.UsedRange 'Source Data
    Hdr = Array("Task", "Subtask", "LineItem", "Dept", "Job Task Title", "Year", "Month", "Hours")
    ReDim arrOutput(1 To UBound(k, 1) * (UBound(k, 2) - 5) + 1, 1 To 8)
    For i = 2 To UBound(k, 1)
        n = n + 1
        For j = 1 To 5
            arrOutput(n, j) = k(i, j)
        arrOutput(n, 6) = Split(k(1, 7), " ")(1)
        arrOutput(n, 7) = Split(k(1, 7), " ")(0)
        arrOutput(n, 8) = k(i, 7)
        For c = 8 To UBound(k, 2)
            n = n + 1
            For j = 1 To 5
                arrOutput(n, j) = k(i, j)
            arrOutput(n, 6) = Split(k(1, c), " ")(1)
            arrOutput(n, 7) = Split(k(1, c), " ")(0)
            arrOutput(n, 8) = k(i, c)
    If n Then
        With wksDest
            .[a1].Resize(, UBound(Hdr) + 1).Value = Hdr
            .[a2].Resize(n, UBound(Hdr) + 1).Value = arrOutput
            Set rngPivot = .UsedRange.Resize(, 8)
        End With
        On Error Resume Next
        Set ptPivot = wksPivot.PivotTables(PivotTableName)
        On Error GoTo 0
        If ptPivot Is Nothing Then
            Set ptCache = ThisWorkbook.PivotCaches.Add(xlDatabase, rngPivot)
            Set ptPivot = ptCache.CreatePivotTable(wksPivot.Range("a3"), PivotTableName)
            With ptPivot
                .AddFields Array("Dept", "LineItem"), Array("Year", "Month")
                .AddDataField .PivotFields("Hours"), "Total Hours", xlSum
                .PivotSelect "Dept[All;Total]", xlDataAndLabel, True
                .PivotFields("Dept").Subtotals = Array( _
                        False, False, False, False, False, False, False, False, False, False, False, False)
                .PivotFields("Dept").LayoutForm = xlTabular
            End With
            Set ptCache = ThisWorkbook.PivotCaches.Add(xlDatabase, rngPivot)
            ptPivot.ChangePivotCache ptCache
        End If
    End If
    Set wksSource = Nothing
    Set wksDest = Nothing
    Set wksPivot = Nothing
    Set rngPivot = Nothing
    Set ptPivot = Nothing
    Set ptCache = Nothing
    Erase k
End Sub

Open in new window


Doh!  My mistake... too many data files open at once with varying naming conventions!  I had read that the error was related to worksheet name, but didn't catch which file I had run the script against last.  I'm accepting your solution with great appreciation.

Krishnakrkc: Thanks for your solution as well.  I haven't done any VBA to create pivot tables, so your code will be interesting to study.

- Steve R.