Link to home
Start Free TrialLog in
Avatar of R Davignon
R DavignonFlag for United States of America

asked on

Cleanse exported data and summarize and group by specific field

I have taskcard program that outputs work accomplished at an auto repair facility and tracks by a specific workcard.  Unfortunately, I need data rolled up to a higher level.  Output is sorted by column B - Car Nbr and then by Column G - Workcard.  I would like to generate a new tab in each spreadsheet that rolls up similar workcards into 1 unique number for each car and then repeated for the next car and any labor hours in columns D and J would be summed together

Pattern Examples:
  • 2308
  • 2308
- These two unique lines would only show up as 1 entry in the output tab with any labor hours for both lines summed together

  • 0018_ZIP01
  • 0018_ZIP01
-These items would show up as one entry "0018" with the the "_xxx" removed and the hours summed together

  • 1855/001
  • 1855/001
-These items would show up as one entry "1855" with the the "/001" removed and the hours summed together

  • 21-97-01
  • 21-97-01
  • 21-97-01/001
-These would show up as one entry "21-97-01" with the "/001" removed and hours summed together

  • C1234AB_0001
  • C1234AB_0002
  • C1234AB_0003
  • C1234AB_0004
-These would show up as one entry "C1234AB" with the "_XXXX" removed and hours summed together

See attached sheet where I show the input and the desired output.

Thank youData-Prep.xlsx
Avatar of aikimark
aikimark
Flag of United States of America image

I started to get the base data of the work code with this function, creating a new column with the formula:
=GetBaseData(G3)

Open in new window

Public Function GetBaseData(ByVal parmText) As String
    Static oRE As Object
    Dim oMatches As Object
    If oRE Is Nothing Then
        Set oRE = CreateObject("vbscript.regexp")
        oRE.Global = False
        oRE.Pattern = "^([^/_]+)"
    End If
    Set oMatches = oRE.Execute(parmText)
    GetBaseData = oMatches(0).submatches(0)
End Function

Open in new window

While that worked well, the mapping of the pivot table results against what you needed ran into some snags when you have multiple different values for the same base work code.
For instance, base work code 1855 has two different values for Data Plate and Description.  In your example, there doesn't appear to be a standard rule for what happens in such circumstances.
Avatar of R Davignon

ASKER

Data Plate is a random value that has no bearing on the output (could be ignored).  Do I place the formula in each cell and copy down:
=GetBaseData(G3)   =GetBaseData(G4) = =GetBaseData(G5)
ASKER CERTIFIED SOLUTION
Avatar of aikimark
aikimark
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
The outputfound in Sheet 1 looks great.  I think that will work.  How do I run the macro or start the code?  I appreciate your help.
1. Do you need to preserve the original data?
2. Is this a one-time process?
I would like to preserve the original data for reference and it would be a one time process for every new spreadsheet.  Receive a new one every week.
yes.  the =getbasedata() formula is in every row in a new column.  The parameter should be the work code column on that row.
This comes close to automating the necessary steps.
Public Sub Q_29093577()
'---------------------------------------------------------------------------------------
' Method : Q_29093577
' Author : Aikimark
' Date   : 4/10/2018
' Purpose: Cleanse exported data and summarize and group by specific field
'---------------------------------------------------------------------------------------
    Dim wksSrc As Worksheet
    Dim wksTgt As Worksheet
    Dim rngSrc As Range
    Dim rngTgt As Range
    
    Application.ScreenUpdating = False
    
    Set wksSrc = Worksheets("DataInput")
    Set wksTgt = Sheets.Add(After:=Sheets(Sheets.Count))
    Set rngTgt = wksTgt.Range("A1")
    Set rngSrc = wksSrc.Range("A1").CurrentRegion
    rngSrc.Copy rngTgt
    
    wksTgt.Rows(1).Delete
    
    wksTgt.Columns("H:H").Insert
    'Selection.Insert Shift:=xlToRight, CopyOrigin:=xlFormatFromLeftOrAbove
    wksTgt.Range("H1").FormulaR1C1 = "WkCardBase"
    Set rngTgt = wksTgt.Range("H2")
    rngTgt.FormulaR1C1 = "=GetBaseData(RC[-1])"
    rngTgt.AutoFill Destination:=wksTgt.Range(wksTgt.Range("H2"), wksTgt.Range("I2").End(xlDown).Offset(0, -1))
    
    rngTgt.Subtotal GroupBy:=9, Function:=xlSum, TotalList:=Array(4, 11), _
        Replace:=True, PageBreaks:=False, SummaryBelowData:=True
    rngTgt.ClearOutline
    wksTgt.Range("I:I").Find(What:="Grand Total").Activate
    Selection.EntireRow.Delete
    
    Set rngSrc = wksTgt.Range("A1").End(xlDown).Offset(1)
    rngSrc.FormulaR1C1 = "=R[-1]C"
    Set rngTgt = rngTgt.CurrentRegion
    Set rngTgt = rngTgt.SpecialCells(xlCellTypeBlanks)
    rngSrc.Copy rngTgt

    Set rngTgt = rngTgt.CurrentRegion
    rngTgt.Value = rngTgt.Value
    
    wksTgt.Range("$A$1").CurrentRegion.AutoFilter Field:=9, Criteria1:="<>*Total*" _
        , Operator:=xlAnd
    
    wksTgt.Range(rngTgt.Rows(2), rngTgt.SpecialCells(xlCellTypeLastCell)).SpecialCells(xlCellTypeVisible).EntireRow.Delete
    
    wksTgt.Range("$A$1").CurrentRegion.AutoFilter
    
    Set rngTgt = wksTgt.Range(wksTgt.Range("I1"), wksTgt.Range("I1").End(xlDown))
    rngTgt.Replace What:=" Total", Replacement:="", LookAt:=xlPart, _
        SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, _
        ReplaceFormat:=False
    
    Set rngTgt = rngTgt.CurrentRegion
    rngTgt.ClearFormats
    
    Application.ScreenUpdating = True

End Sub

Open in new window

I ran it on the sample data input and it all worked except I noted the following:

25-93-05
25-93-05

showed up twice in the output
That's what I mentioned in the earlier comment about multiple values for a given WorkCodeBase.
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Thank you for all your help on this.
Ability to provide the error cleansing and extract with such a limited set of lines of code.