Cleanse exported data and summarize and group by specific field

I have taskcard program that outputs work accomplished at an auto repair facility and tracks by a specific workcard.  Unfortunately, I need data rolled up to a higher level.  Output is sorted by column B - Car Nbr and then by Column G - Workcard.  I would like to generate a new tab in each spreadsheet that rolls up similar workcards into 1 unique number for each car and then repeated for the next car and any labor hours in columns D and J would be summed together

Pattern Examples:
  • 2308
  • 2308
- These two unique lines would only show up as 1 entry in the output tab with any labor hours for both lines summed together

  • 0018_ZIP01
  • 0018_ZIP01
-These items would show up as one entry "0018" with the the "_xxx" removed and the hours summed together

  • 1855/001
  • 1855/001
-These items would show up as one entry "1855" with the the "/001" removed and the hours summed together

  • 21-97-01
  • 21-97-01
  • 21-97-01/001
-These would show up as one entry "21-97-01" with the "/001" removed and hours summed together

  • C1234AB_0001
  • C1234AB_0002
  • C1234AB_0003
  • C1234AB_0004
-These would show up as one entry "C1234AB" with the "_XXXX" removed and hours summed together

See attached sheet where I show the input and the desired output.

Thank youData-Prep.xlsx
robbdfwAsked:
Who is Participating?
 
aikimarkConnect With a Mentor Commented:
This comes close.  There is some difference due to missing group E.  How does it look to you?
Data-Prep.xlsm
0
 
aikimarkCommented:
I started to get the base data of the work code with this function, creating a new column with the formula:
=GetBaseData(G3)

Open in new window

Public Function GetBaseData(ByVal parmText) As String
    Static oRE As Object
    Dim oMatches As Object
    If oRE Is Nothing Then
        Set oRE = CreateObject("vbscript.regexp")
        oRE.Global = False
        oRE.Pattern = "^([^/_]+)"
    End If
    Set oMatches = oRE.Execute(parmText)
    GetBaseData = oMatches(0).submatches(0)
End Function

Open in new window

While that worked well, the mapping of the pivot table results against what you needed ran into some snags when you have multiple different values for the same base work code.
For instance, base work code 1855 has two different values for Data Plate and Description.  In your example, there doesn't appear to be a standard rule for what happens in such circumstances.
0
 
robbdfwAuthor Commented:
Data Plate is a random value that has no bearing on the output (could be ignored).  Do I place the formula in each cell and copy down:
=GetBaseData(G3)   =GetBaseData(G4) = =GetBaseData(G5)
0
Cloud Class® Course: CompTIA Healthcare IT Tech

This course will help prep you to earn the CompTIA Healthcare IT Technician certification showing that you have the knowledge and skills needed to succeed in installing, managing, and troubleshooting IT systems in medical and clinical settings.

 
robbdfwAuthor Commented:
The outputfound in Sheet 1 looks great.  I think that will work.  How do I run the macro or start the code?  I appreciate your help.
0
 
aikimarkCommented:
1. Do you need to preserve the original data?
2. Is this a one-time process?
0
 
robbdfwAuthor Commented:
I would like to preserve the original data for reference and it would be a one time process for every new spreadsheet.  Receive a new one every week.
0
 
aikimarkCommented:
yes.  the =getbasedata() formula is in every row in a new column.  The parameter should be the work code column on that row.
0
 
aikimarkCommented:
This comes close to automating the necessary steps.
Public Sub Q_29093577()
'---------------------------------------------------------------------------------------
' Method : Q_29093577
' Author : Aikimark
' Date   : 4/10/2018
' Purpose: Cleanse exported data and summarize and group by specific field
'---------------------------------------------------------------------------------------
    Dim wksSrc As Worksheet
    Dim wksTgt As Worksheet
    Dim rngSrc As Range
    Dim rngTgt As Range
    
    Application.ScreenUpdating = False
    
    Set wksSrc = Worksheets("DataInput")
    Set wksTgt = Sheets.Add(After:=Sheets(Sheets.Count))
    Set rngTgt = wksTgt.Range("A1")
    Set rngSrc = wksSrc.Range("A1").CurrentRegion
    rngSrc.Copy rngTgt
    
    wksTgt.Rows(1).Delete
    
    wksTgt.Columns("H:H").Insert
    'Selection.Insert Shift:=xlToRight, CopyOrigin:=xlFormatFromLeftOrAbove
    wksTgt.Range("H1").FormulaR1C1 = "WkCardBase"
    Set rngTgt = wksTgt.Range("H2")
    rngTgt.FormulaR1C1 = "=GetBaseData(RC[-1])"
    rngTgt.AutoFill Destination:=wksTgt.Range(wksTgt.Range("H2"), wksTgt.Range("I2").End(xlDown).Offset(0, -1))
    
    rngTgt.Subtotal GroupBy:=9, Function:=xlSum, TotalList:=Array(4, 11), _
        Replace:=True, PageBreaks:=False, SummaryBelowData:=True
    rngTgt.ClearOutline
    wksTgt.Range("I:I").Find(What:="Grand Total").Activate
    Selection.EntireRow.Delete
    
    Set rngSrc = wksTgt.Range("A1").End(xlDown).Offset(1)
    rngSrc.FormulaR1C1 = "=R[-1]C"
    Set rngTgt = rngTgt.CurrentRegion
    Set rngTgt = rngTgt.SpecialCells(xlCellTypeBlanks)
    rngSrc.Copy rngTgt

    Set rngTgt = rngTgt.CurrentRegion
    rngTgt.Value = rngTgt.Value
    
    wksTgt.Range("$A$1").CurrentRegion.AutoFilter Field:=9, Criteria1:="<>*Total*" _
        , Operator:=xlAnd
    
    wksTgt.Range(rngTgt.Rows(2), rngTgt.SpecialCells(xlCellTypeLastCell)).SpecialCells(xlCellTypeVisible).EntireRow.Delete
    
    wksTgt.Range("$A$1").CurrentRegion.AutoFilter
    
    Set rngTgt = wksTgt.Range(wksTgt.Range("I1"), wksTgt.Range("I1").End(xlDown))
    rngTgt.Replace What:=" Total", Replacement:="", LookAt:=xlPart, _
        SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, _
        ReplaceFormat:=False
    
    Set rngTgt = rngTgt.CurrentRegion
    rngTgt.ClearFormats
    
    Application.ScreenUpdating = True

End Sub

Open in new window

0
 
robbdfwAuthor Commented:
I ran it on the sample data input and it all worked except I noted the following:

25-93-05
25-93-05

showed up twice in the output
0
 
aikimarkCommented:
That's what I mentioned in the earlier comment about multiple values for a given WorkCodeBase.
0
 
aikimarkConnect With a Mentor Commented:
If you need to sum SOLELY by the (base) work code, then you'll need to sort by the original workcode column before invoking the VBA code.  Alternatively, you can tweak the VBA code to do the sort.
0
 
robbdfwAuthor Commented:
Thank you for all your help on this.
0
 
robbdfwAuthor Commented:
Ability to provide the error cleansing and extract with such a limited set of lines of code.
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.