Link to home
Start Free TrialLog in
Avatar of Jerry L
Jerry LFlag for United States of America

asked on

Excel: Sort & Match Rows - Must Handle Two Sets of Data Stacked in One Column

I had this code written in this post:
https://www.experts-exchange.com/questions/25192346/Excel-Match-Rows-Macro-Not-Working.html
(See attached file for the code and sample data.)

Here's the previous specification
------------------------------------------

Sub MatchEm_9()
' Assumes that the two sets of data are in parallel columns. No data is to the right of the second set. _
     The two sets of data must start in the same row, but need not contain the same data types, or number of columns. _
     Nothing should be underneath either set of data. Blank row in the data set with the most rows terminates execution.
' The "key" must be the first column in each set of data
' The number of columns in each list is determined by looking to the right for a blank cell.
' Make sure all blank cells of the first rows are filled in with "dummy" values.


New specification to be as follows
--------------------------------------------

Sub MatchEm_10()
' The two sets of data must be separated by at least one blank row if Stacked one on top of the other
'      or by one blank column if the data sets are next to one another.
' Each of the two data sets may contain a different number of columns and rows.
' The Max number of columns in each set is determined by a Blank Cell in the first row of each set.
' User MUST make sure all blank cells of the first row of each data set are filled in with "dummy" values so all columns of data will be included before a blank cell is reached.
' Data Set 1 ends when a Blank Cell in it's first column is parsed.
' Data Set 2 ends when a Blank Cell in it's first column is parsed.
' Both data sets must be parsed completely.
' The program allows the user to select the first cell in the first row in each set of data.


The code should work even when one set of data is directly below the other as long as there is a blank row between them.

Below, in the code window, is the way the results should look, and it works fine when the second set of data starts in a completely different group of columns from the first set of data, as specified by the original spec. (Paste it into a sheet to see it formatted properly.)

In the attached file, Sheet1 gives the layout of data I want the code to work with.
Sheet2 shows the data in a format that works with the current macro, matchem_9()
The file also contains the current version of the code.

Try it first with set 1 as the first selection, and set 2 as the second selection.  But, you should be able to use any set of data (1, 2, or 3) as the first selection, and any other as the second selection. If set 2 is selected first, then either 1 or 3 could be the second set, for example.
data set 1 word 9	any data 9		Mismatch	data set 2 word 9	any data 9
data set 1 word 8	any data 8			data set 1 word 8	any data 8
data set 1 word 7	any data 7			data set 1 word 7	any data 7
data set 1 word 4	any data 4			data set 1 word 4	any data 4
data set 1 word 3	any data 3			data set 1 word 3	any data 3
data set 1 word 2	any data 2			data set 1 word 2	any data 2
data set 1 word 19	any data 19			data set 1 word 19	any data 19
data set 1 word 18	any data 18			data set 1 word 18	any data 18
data set 1 word 17	any data 17			data set 1 word 17	any data 17
data set 1 word 15	any data 15		Mismatch	data set 1 word 6	any data 6
data set 1 word 14	any data 14			data set 1 word 14	any data 14
data set 1 word 13	any data 13		Mismatch	data set 1 word 5	any data 5
data set 1 word 11	any data 11			data set 1 word 11	any data 11
data set 1 word 10	any data 10			data set 1 word 10	any data 10
data set 1 word 1	any data 1		Mismatch	data set 1 word 16	any data 16
			Mismatch	data set 1 word 12	any data 12

Open in new window

MACRO-10-Sort-and-Match-by-Rows.xls
Avatar of byundt
byundt
Flag of United States of America image

I apologize for taking so long to respond to this thread, but I've been on the road.

Excel VBA can find the last cell in a set by looking up from the bottom or by looking down from the top. The former is generally more reliable, but the latter is required by your latest specification.

Brad
Sub MatchEm_10()
' Assumes that the two sets of data are in parallel columns. No data is to the right of the second set. _
     The two sets of data must start in the same row, but need not contain the same data types, or number of columns. _
     Nothing should be underneath either set of data. Blank row in the data set with the most rows terminates execution.
' The "key" must be the first column in each set of data
' The number of columns in each list is determined by looking to the right for a blank cell.
' The number of rows in each list is determined by looking down in the first column for a blank cell.
' In other words, make sure the first row and column are fully populated, and include a blank row below and _
        blank column to the right of each list
'
Dim frmla1 As String, frmla2 As String
Dim addr1 As String, addr2 As String, sRows As String
Dim celHome As Range, rgA As Range, rgB As Range, rg1 As Range, rg2 As Range, rgSort As Range
Dim nCols1 As Long, nCols2 As Long, nRows As Long, nRows1 As Long, nRows2 As Long
Dim ws As Worksheet

Set celHome = ActiveCell    'Set a "return address" for the end of the macro
On Error Resume Next        'Turn error handling off. _
                                If you don't pick a range, the InputBox function will fail. _
                                If RSLT worksheet doesn't exist, then trying to set a worksheet variable to it will fail.
Set rgA = Application.InputBox("Please pick the top left cell in set 1", Type:=8)
If rgA Is Nothing Then Exit Sub     'No range was picked, so exit sub
Set rgB = Application.InputBox("Please pick the top left cell in set 2", Type:=8)
If rgB Is Nothing Then Exit Sub     'No range was picked, so exit sub
Set ws = Worksheets("RSLT")
If ws Is Nothing Then   'RSLT worksheet does not exist, so let's add one to this workbook
    Set ws = Worksheets.Add(After:=Worksheets(Worksheets.Count))    'Add the new sheet after the last worksheet in the workbook
    ws.Name = "RSLT"
End If
If ws.UsedRange.Rows.Count > 1 Then     'The RSLT worksheet already contains data. Ask user if it should be cleared.
    If MsgBox("The RSLT worksheet already contains data. Do you want to delete it?", vbYesNoCancel) <> vbYes Then Exit Sub
    ws.Cells.ClearContents      'Clear all existing data
End If
On Error GoTo 0         'Restore normal error handling

Application.ScreenUpdating = False      'Eliminate screen flicker while macro runs
nCols1 = 1      'Count the number of columns in the two lists. They don't have to be the same.
nCols2 = 1
If rgA.Cells(1, 2) <> "" Then nCols1 = rgA.End(xlToRight).Column - rgA.Column + 1   'The If test makes sure that List1 and List2 contain at least two columns of data
If rgB.Cells(1, 2) <> "" Then nCols2 = rgB.End(xlToRight).Column - rgB.Column + 1   'The .End(xlToRight) finds the cell to the left of the first blank in the row
    'Rows.Count is number of rows in the spreadsheet (varies with Excel version)
    'rgA.Worksheet returns the worksheet that contains List1; rgB.Worksheet returns the worksheet that contains List2 (need not be the same)
    'rgA.Column is column number of first cell in List1
nRows1 = rgA.End(xlDown).Row - rgA.Row + 1      'Look down from top of list to find last row in List1. Determine number of rows.
nRows2 = rgB.End(xlDown).Row - rgB.Row + 1      'Look down from top of list to find last row in List2. Determine number of rows.
nRows = IIf(nRows1 > nRows2, nRows1, nRows2)        'The number of rows in the longer list
Set rgA = rgA.Resize(nRows1, nCols1)       'All the data in List1. Resize grows the length of the range variable
Set rgB = rgB.Resize(nRows2, nCols2)       'All the data in List2. Resize could also grow the width, but that feature not needed here

With ws         'All objects or properties that begin with a . refer to worksheet RSLT
    rgA.Copy
    .Cells(1, 1).PasteSpecial xlPasteValuesAndNumberFormats     'Paste values and number formats, starting in cell A1
    Set rg1 = .Cells(1, 1).Resize(nRows1, 1)            'The first column of List1, which must contain the "key"
    
    rgB.Copy    'Paste List2 with three blank columns between List1 and List2
    .Cells(1, nCols1 + 4).PasteSpecial xlPasteValuesAndNumberFormats
    Set rg2 = .Cells(1, nCols1 + 4).Resize(nRows, 1)    'The first column of List2, which must contain the "key". If List1 is longer, pad with blank rows at bottom.
    Application.Goto .Cells(1, 1)  'Select cell A1 on the RSLT worksheet. This unselects the List2 cells that had been pasted.
End With

Set rgSort = rg2.Offset(0, -2).Resize(, 2)   'The two blank columns for the auxiliary formulas
addr1 = rg1.Address(ReferenceStyle:=xlR1C1)                 'Address of List1 first column in R1C1 format
addr2 = rgSort.Columns(1).Address(ReferenceStyle:=xlR1C1)   'Address of auxiliary formula columns in R1C1 format
sRows = "ROW(R1:R" & nRows & ")"        'The ROW function returns the numbers 1 through the number of rows in the longer list.
    '
    'Returns the index number in List1 where the key from List2 matches, or error value #N/A if no match was found
frmla1 = "=MATCH(RC[2]," & addr1 & ",0)"
    '
    'Array formula that inserts the "left-over" numbers into the results from frmla1 whenever no match was found
frmla2 = "=IF(ISNA(RC[-1]),SMALL(IF(ISNA(MATCH(" & sRows & "," & addr2 & ",0))," & sRows & _
    ",""""),COUNTIF(R" & rg2.Row & "C[-1]:RC[-1],""#N/A"")),RC[-1])"
rgSort.Columns(1).FormulaR1C1 = frmla1      'Put frmla1 in the first auxiliary column
rgSort.Cells(1, 2).FormulaArray = frmla2    'Put frmla2 in the first cell of second auxiliary column, as an array formula
rgSort.Columns(2).FillDown                  'Copy down the array formula

    'Sort the auxiliary columns and List2 by the results of frmla2. This will align List2 rows with their match from List1
rgSort.Resize(, 2 + nCols2).Sort Key1:=rgSort.Cells(1, 2), Order1:=xlAscending, Header:=xlNo
rgSort.Columns(2).EntireColumn.Delete      'Delete the second auxiliary column
rgSort.Columns(1).SpecialCells(xlCellTypeFormulas, xlNumbers).ClearContents
rgSort.Columns(1).SpecialCells(xlCellTypeFormulas, xlErrors).Value = "Mismatch"

Application.Goto celHome        'Return to the original cell that was active
Application.ScreenUpdating = True
End Sub

Open in new window

Avatar of Jerry L

ASKER

Brad, that's working great. Could you do one more thing on this?

Instead of prompting the user if the sheet 'RSLT' already exists, can you tell the script to create 'RSLT1', and if 'RSLT1' exists, then create 'RSLT2', etc. This will be especially helpful when I need to run several comparisons of data at one time.

Here's how I've seen it done in another script:

         Set myWB = Application.ActiveWorkbook
         strNewSheet = "RSLT" & myWB.Sheets.Count + 1
         etc.

Open in new window

ASKER CERTIFIED SOLUTION
Avatar of byundt
byundt
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of Jerry L

ASKER

That works great.
Avatar of Jerry L

ASKER

You might be interested in my new question: Q_25351342
Excel: Multiple Words Based on Percentage
Jerry,
Thanks for the kind words and grade!
Brad