Solved

Transform Excel Horizontal Data to Vertical Data for Pivoting

Posted on 2010-09-03
6
933 Views
Last Modified: 2012-05-10
I need a VBA script to transform a large Excel spreadsheet from a horizontal, easy-to-read format to a vertical, easy-to-pivot format. I found an example on Bill Jelen's MrExcel site (Google "Convert Horizontal Data to Vertical Data Using a Macro for Optimal Excel Pivot Table Analysis"), but it's for a much simpler dataset and I couldn't get it to scale.

The data export I'm working from is a listing of task labor estimates that can cover anywhere from a couple of months to 3 years or more. Each task can have one or more subtasks, each with it's own labor estimate. As can be seen in the sample dataset, the period of performance for a particular task may be a contiguous series of months or could be spread over many months with some months of inactivity located throughout.  Note that my actual dataset has 18 columns before the Total Hours column instead of 5 and about 1000 rows, but I trimmed it to keep the sample cleaner.

During peak operations, my team has to pull an export from the Oracle-based repository on a daily basis and report updated estimates ordered by Department. Ideally we can use a pivot table to produce reports and charts for each department and summary data for the entire organization.  Unfortunately, the Oracle system outputs the export file as an Excel file that's formatted for reading, not optimized for pivoting.  I need a clean columnar list of data for pivoting (which I would also use as input to an Access database for custom reports).

The SampleData1.xls file attached here shows the original ExportData on the first tab, the desired format on the Transformed tab, and a pivot table on the Pivot tab. Given the popularity of pivot tables and the way most Excel worksheets are formatted, I'm sure this problem has been solved before, but I couldn't find a solution in the archives here that wasn't Access specific. I sometimes receive Excel files from customers or vendors that are also set up for reading, so what I would like to build is a macro that allows my team to easily change a few key parameters and use it on the other datasets as well.

Any pointers would be greatly appreciated, or if you know of an existing solution you can point me to, that would be great as well.  I'm working on it now, but my coding skills don't tend to produce real elegant solutions.

SampleData1.xls
0
Comment
Question by:srauler58
6 Comments
 
LVL 17

Accepted Solution

by:
calacuccia earned 250 total points
ID: 33600965
This should do the job, have a look at the comments in the VBA, some explanation is provided, if you need more, feel free to ask :-)


Sub TransformForPivotTable()

Dim wb As Workbook

Dim ExportS As Worksheet

Dim TranS As Worksheet

Dim lRow As Long

Dim fixedR As Range, TransR As Range



Application.ScreenUpdating = False

Set wb = ActiveWorkbook 'Set up to run from another workbook, activate the Export workbook first

                        'If not ok, you can also use --> Set wb = ThisWorkbook or Set wb = Workbooks("myName.xls")

Set ExportS = wb.Worksheets("ExportData")

Set TranS = wb.Worksheets.Add

TranS.Name = "Transformed by VBA"

lRow = ExportS.Range("A" & ExportS.Rows.Count).End(xlUp).Row

'This to be changed to the columns which are to be repeated, in this case, 5 columns (A to E)

Set fixedR = ExportS.Range("A2:E" & lRow)

'This is the range containing the data to be reorganized: in this example starting from column G --> Jan 2011

Set TransR = ExportS.Range(ExportS.Range("G2"), ExportS.Cells(2, ExportS.Columns.Count).End(xlToLeft).Offset(lRow - 2, 0))

'Copy headers and create new headers Year-Month-Hours

fixedR.Resize(1, fixedR.Columns.Count).Offset(-1, 0).Copy

TranS.Paste TranS.Range("A1")

TranS.Cells(1, fixedR.Columns.Count + 1) = "Year"

TranS.Cells(1, fixedR.Columns.Count + 2) = "Month"

TranS.Cells(1, fixedR.Columns.Count + 3) = "Hours"

TranS.Cells(1, 1).Copy

TranS.Cells(1, fixedR.Columns.Count + 1).Resize(1, 3).PasteSpecial xlPasteFormats

'Loop through each row, and write the results to the Tranformed worksheet

k = 2

For i = 1 To lRow - 1

    For j = 1 To TransR.Columns.Count

        If TransR.Cells(i, j) <> 0 Then

            fixedR.Rows(i).Copy

            TranS.Paste TranS.Rows(k)

            TranS.Cells(k, fixedR.Columns.Count + 1) = Right(ExportS.Cells(1, TransR.Cells(i, j).Column), 4)

            TranS.Cells(k, fixedR.Columns.Count + 2) = Left(ExportS.Cells(1, TransR.Cells(i, j).Column), 3)

            TranS.Cells(k, fixedR.Columns.Count + 3) = TransR.Cells(i, j)

            k = k + 1

        End If

    Next j

Next i

Application.ScreenUpdating = False

TranS.Columns("A:H").AutoFit

End Sub

Open in new window

0
 
LVL 13

Expert Comment

by:Brian Withun
ID: 33601110
have the ExportData sheet active and run this macro on it


Sub Traverse()



    Dim Export As Worksheet

    Set Export = ActiveSheet

    

    Dim Estimates As Range

    ActiveSheet.Range("I2").Select

    Range(Selection, Selection.End(xlToRight)).Select

    Range(Selection, Selection.End(xlDown)).Select

    

    Set Estimates = Selection

    

    Dim ResultSheet As Worksheet

    Set ResultSheet = Worksheets.Add()

    

    ResultSheet.Name = "OutputSheet"

    

    Export.Activate

    Range("A1:H1").Select

    Selection.Copy

    ResultSheet.Paste

    Range("F1").Value = "Year"

    Range("G1").Value = "Month"

    Range("H1").Value = "Hours"

    

    Dim i As Integer ' loop iterator

    Dim r As Integer ' record#

    Let r = 1

    

    Dim h As Integer ' horizontal index

    Dim v As Integer ' vertical index

    

    For v = 0 To Estimates.Rows.Count - 1

        For h = 0 To Estimates.Columns.Count - 1

            With Export

                ' do not include zero values

                If .Cells(Estimates.Row + v, Estimates.Column + h) > 0 Then

                

                    r = r + 1

                    

                    ' duplicate the leftmost 5 columns

                    For i = 1 To 5

                        ResultSheet.Cells(r, i) = .Cells(Estimates.Row + v, i)

                    Next i

                    

                    ' supply the year/month

                    ResultSheet.Cells(r, 6) = right(.Cells(1, 9 + h), 4)

                    ResultSheet.Cells(r, 7) = left(.Cells(1, 9 + h), 3)

                    

                    ' migrate the data value

                    ResultSheet.Cells(r, 8) = .Cells(Estimates.Row + v, 9 + h)

                

                End If

            End With

        Next h

    Next v



End Sub

Open in new window

0
 

Author Comment

by:srauler58
ID: 33602052
Calacuccia & bhwithun:
Thanks to both of you for your time and effort.  I ran both of the solutions against separate test data files and both approaches appeared sound.  Calacuccia's script came up with a totally correct solution on the abbreviated dataset and bhwithun's had everything correct except, for some reason, the headings of the last 3 columns ended up being "Total Hours", "Jan 2011" & "Feb 2011" instead of the values the code specified, while the ExportData sheet's columns had the new values.

When I run Calacuccia's script against my full dataset, I get a "Subscript out of range (Error 9)" message and the script halts when the "Set ExportS = wb.Worksheets("ExportData")" is executed. I got this same result against the abbreviated dataset when I changed the "Set wb = ActiveWorkbook" statement to "Set wb = ThisWorkbook" , but the original statement worked fine on the small set.  Against the full dataset, however, it halts every time. Could I be running up against a dataset size limit?  The full dataset has 67 columns (18 before the Total Hours column) and 727 total rows. That's 48,709 cells... would that overflow an array limitation of some sort? My full dataset will grow larger in the coming weeks to possibly more than 1000 rows.

I tried running the Traverse() script against the full dataset as well, but I must have messed up when changing the key parameters and, while I didn't get a "Subscript out of range" error, the resulting format was pretty messed up.  I'll have to study that one tomorrow when I'm not so tired.

Thanks again to both of you for your time!  Any light you can shed on the "Subscript out of range" error would be greatly appreciated.

- Steve R.
0
Do You Know the 4 Main Threat Actor Types?

Do you know the main threat actor types? Most attackers fall into one of four categories, each with their own favored tactics, techniques, and procedures.

 
LVL 17

Expert Comment

by:calacuccia
ID: 33602250
Hi Steve,

The Subscript out of Range has nothing to do with size limit, it is because the worksheet name is not correct.

You should change  "ExportData" to the name of the worksheet containing the full dataset.
Or to simplify you might use. The worksheet with the dataset must be selected for ActiveSheet to recongnize the right sheet.

Set ExportS = wb.ActiveSheet
0
 
LVL 18

Expert Comment

by:krishnakrkc
ID: 33602404
Hi,

Try this. This would create the Pivot table as well.

Note: Always create a backup copy of your file before try this code.

I assume you have 3 sheets in your workbook "ExportData","Transformed","Pivot"

Kris
Sub kTest()
    'Author :   Krishnakumar
    'Ref    :   http://www.experts-exchange.com/Software/Office_Productivity/Office_Suites/MS_Office/Excel/Q_26451108.html

    Dim wksSource   As Worksheet
    Dim wksDest     As Worksheet
    Dim wksPivot    As Worksheet
    Dim rngPivot    As Range
    Dim arrOutput()
    Dim k
    Dim i           As Long
    Dim c           As Long
    Dim n           As Long
    Dim j           As Long
    Dim Hdr
    Dim ptPivot     As PivotTable
    Dim ptCache     As PivotCache
    
    Const PivotTableName    As String = "MyPivotTable"
    
    Set wksSource = Sheets("ExportData")
    Set wksDest = Sheets("Transformed")
    Set wksPivot = Sheets("Pivot")
    
    k = wksSource.UsedRange 'Source Data
    
    Hdr = Array("Task", "Subtask", "LineItem", "Dept", "Job Task Title", "Year", "Month", "Hours")
    ReDim arrOutput(1 To UBound(k, 1) * (UBound(k, 2) - 5) + 1, 1 To 8)
    
    For i = 2 To UBound(k, 1)
        n = n + 1
        For j = 1 To 5
            arrOutput(n, j) = k(i, j)
        Next
        arrOutput(n, 6) = Split(k(1, 7), " ")(1)
        arrOutput(n, 7) = Split(k(1, 7), " ")(0)
        arrOutput(n, 8) = k(i, 7)
        For c = 8 To UBound(k, 2)
            n = n + 1
            For j = 1 To 5
                arrOutput(n, j) = k(i, j)
            Next
            arrOutput(n, 6) = Split(k(1, c), " ")(1)
            arrOutput(n, 7) = Split(k(1, c), " ")(0)
            arrOutput(n, 8) = k(i, c)
        Next
    Next
    If n Then
        With wksDest
            .UsedRange.Clear
            .[a1].Resize(, UBound(Hdr) + 1).Value = Hdr
            .[a2].Resize(n, UBound(Hdr) + 1).Value = arrOutput
            Set rngPivot = .UsedRange.Resize(, 8)
        End With
        
        On Error Resume Next
        Set ptPivot = wksPivot.PivotTables(PivotTableName)
        On Error GoTo 0
        
        If ptPivot Is Nothing Then
            wksPivot.UsedRange.Clear
            Set ptCache = ThisWorkbook.PivotCaches.Add(xlDatabase, rngPivot)
            Set ptPivot = ptCache.CreatePivotTable(wksPivot.Range("a3"), PivotTableName)
                
            With ptPivot
                .AddFields Array("Dept", "LineItem"), Array("Year", "Month")
                .AddDataField .PivotFields("Hours"), "Total Hours", xlSum
                .PivotSelect "Dept[All;Total]", xlDataAndLabel, True
                .PivotFields("Dept").Subtotals = Array( _
                        False, False, False, False, False, False, False, False, False, False, False, False)
                .PivotFields("Dept").LayoutForm = xlTabular
            End With
        Else
            Set ptCache = ThisWorkbook.PivotCaches.Add(xlDatabase, rngPivot)
            ptPivot.ChangePivotCache ptCache
            ptPivot.RefreshTable
        End If
    End If
    
    Set wksSource = Nothing
    Set wksDest = Nothing
    Set wksPivot = Nothing
    Set rngPivot = Nothing
    Set ptPivot = Nothing
    Set ptCache = Nothing
    Erase k
    
End Sub

Open in new window

0
 

Author Comment

by:srauler58
ID: 33603660
Calacuccia,

Doh!  My mistake... too many data files open at once with varying naming conventions!  I had read that the error was related to worksheet name, but didn't catch which file I had run the script against last.  I'm accepting your solution with great appreciation.

Krishnakrkc: Thanks for your solution as well.  I haven't done any VBA to create pivot tables, so your code will be interesting to study.

- Steve R.
0

Featured Post

Backup Your Microsoft Windows Server®

Backup all your Microsoft Windows Server – on-premises, in remote locations, in private and hybrid clouds. Your entire Windows Server will be backed up in one easy step with patented, block-level disk imaging. We achieve RTOs (recovery time objectives) as low as 15 seconds.

Join & Write a Comment

The canonical version of this article is on my web site here: http://iconoun.com/articles/collisions/ A companion presentation is available here: http://iconoun.com/articles/collisions/Unicode_Presentation.pdf
User Beware!  This is a rather permanent solution to removing your email from an exchange server.  The only way to truly go back is to have your exchange administrator restore your mailbox from backups.  This is usually the option of last resort.  A…
Graphs within dashboards are meant to be dynamic, representing data from a period of time that will change each time the dashboard is updated with new data. Rather than update each graph to point to a different set within a static set of data, t…
This Micro Tutorial will demonstrate in Google Sheets how to use the HYPERLINK function to create live links inside your spreadsheet.

708 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

15 Experts available now in Live!

Get 1:1 Help Now