Macro to delete error rows while retaining/summing some deleted information

Posted on 2014-03-20
Last Modified: 2014-03-21

I need a macro to delete some rows while retaining and/or summing some of the information contained within specific columns of these rows. I have attached some Sample Data. As a starting point, for column F (labelled “T/C”), this only concerns rows where the entry in column F is “C” rather than “T”. I have attached a screenshot of the data (Sample Data Screenshot 1), and I will refer to rows 67 to 72 in Sheet1 of the sample data file as an example.

The problem can be seen in column M, “TimeSLC”. Each row beneath row 67 exists as a result of a data collection error. As can be seen, the value for rows 68-72 in column M is 4, related to the 4 minute gap between the final time and the initial time in columns D and B, respectively.

Using the example provided (rows 67-72), the following is required:

• Take the initial date and initial time from row 67, and combine this with the final date and final time in row 72. Delete the date and time entries between these, thereby remaining with one row with the correct date and time.
• For column I, ChTime, sum each of the values into this new row.
• Use the initial value in DistSLC (column L), and ignore (i.e. delete) the blanks below.
• Use the initial value of TimeSLC (column M), and ignore (i.e. delete) any values below.
• Use the value for SOCB in the first entry and SOCA in the final entry, and input these values into the new row, with the intermediate entries deleted.
• Use the values for columns R, S, and T (Loc1, Loc2 and Clim, respectively; see the attached sample data) from the first row of each occurrence.

I have attached a second screenshot (Sample Data Screenshot 2) with the desired outcome specific to these rows 67-72 (it must be noted that column I, ChTime, which is now equal to 305, does not exactly match the new amount of time between the final and initial times in the row, but this is not of concern).

The macro should stop summing and/or deleting the subsequent rows once either the value of column M (TimeSLC) is sufficiently large, such as greater than 60, or once the value of column F switches from “C” to “T” (in that order of preference; the value in column M should be checked before checking if the value of column F changed).

It must be noted that the value in column M (TimeSLC) is not necessarily 4 – it can be another small value, within the range of 0-59. Also, the dataset contains approximately 55,000 rows.

The problem always concerns when there are 2 or more values of “C” in a row; however, when there are multiple values of C in a row there is not always an error at play. Referring to Column M, if the second value is 60 or greater, then the entry is valid. There are two additional sample datasets on Sheet2 and Sheet 3 as examples.

Thanks for any help in making the macro.
Question by:weldo
  • 2
  • 2
LVL 20

Accepted Solution

ElrondCT earned 500 total points
ID: 39945377
I think this should do the trick for you:

Sub MergeData()
    Dim RowEnd As Integer
    Dim iLine As Integer
    Dim StartLine As Integer
    Dim MinTime As Integer
    Dim chTime As Integer
    MinTime = 60                                    ' Minimum valid time for column M
    Application.ScreenUpdating = False
    RowEnd = Range("A1").End(xlDown).Row            ' Find last row of data
    ' Through the worksheet in reverse order, so deleting lines doesn't mess up our position
    For iLine = RowEnd To 3 Step -1
        ' Look for two C lines, the lower of which has a time below the minimum
        If Cells(iLine, 6) = "C" And Cells(iLine - 1, 6) = "C" _
        And Cells(iLine, 13) < MinTime Then
            chTime = Cells(iLine, 9)
            StartLine = iLine - 1
            ' Go up until we find one that's a valid line
            While Cells(StartLine, 6) = "C" And Cells(StartLine, 13) < MinTime And StartLine > 1
                chTime = chTime + Cells(StartLine, 9)
                StartLine = StartLine - 1
            ' If StartLine is a T line, then move back down one
            If Cells(StartLine, 6) = "T" Then
                StartLine = StartLine + 1
                chTime = chTime - Cells(StartLine, 9)   ' Don't doublecount
            End If
            ' StartLine now has a valid line; merge the group of data
            Cells(StartLine, 9) = Cells(StartLine, 9) + chTime
            Cells(StartLine, 3) = Cells(iLine, 3)
            Cells(StartLine, 4) = Cells(iLine, 4)
            Cells(StartLine, 15) = Cells(iLine, 15)
            Rows(Trim(Str(StartLine + 1)) & ":" & Trim(Str(iLine))).Delete Shift:=xlUp
            iLine = StartLine                   ' Skip deleted lines
        End If
    Application.ScreenUpdating = True
End Sub

Open in new window

The easiest way to make changes where you're deleting rows is going in reverse, because otherwise it's hard to keep the row pointer in the proper place. I've made the minimum time a variable that you set up front, so if you decide to change the cutoff, you won't miss a reference. If you want to watch the deleted lines disappear, you can leave ScreenUpdating on, but it runs a lot more slowly.

Author Comment

ID: 39945617

Thanks a million for your help and the additional information, I tested the macro on the Sample Data and it worked exactly as desired.

However, when I checked to see if it worked on the overall database I received the Overflow error, "Run time error '6'".

It appears the error is in line 9 of the code:
RowEnd = Range("A1").End(xlDown).Row

It seems like it has located the correct row (row 55,453) but the error persists. I'm just wondering if you know what the problem might be?

Thanks again for your help!
LVL 20

Expert Comment

ID: 39945690
Try changing RowEnd, iLine, and StartLine from Integer to Long.

Author Comment

ID: 39945701
Yeah that worked perfectly, cheers again!

Featured Post

Best Practices: Disaster Recovery Testing

Besides backup, any IT division should have a disaster recovery plan. You will find a few tips below relating to the development of such a plan and to what issues one should pay special attention in the course of backup planning.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

You can of course define an array to hold data that is of a particular type like an array of Strings to hold customer names or an array of Doubles to hold customer sales, but what do you do if you want to coordinate that data? This article describes…
Excel can be a tricky bit of software to get your head around. Whilst you’ll be able to eventually get to grips with the basic understanding of how to get by, there are a few Excel tips that not everybody will even know about let alone know how to d…
The viewer will learn how to use a discrete random variable to simulate the return on an investment over a period of years, create a Monte Carlo simulation using the discrete random variable, and create a graph to represent the possible returns over…
Graphs within dashboards are meant to be dynamic, representing data from a period of time that will change each time the dashboard is updated with new data. Rather than update each graph to point to a different set within a static set of data, t…

809 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question