Ok, I had a previous post requesting removal of rows containing duplicate values. This was answered promptly and well (thank you imnorie). Now I need to remove those duplicate values without removing the cells/rows containing only numerical data.

Initially I thought of copying the numerical data out of the document, sorting it and then putting it back in. But my inexperience got the better of me. Below is the script so far. If someone could lend me a hand, I'd really appreciate the help.

Cheers
G_M

Public Sub SortDocumentInExcel() Dim strWorkingFile As String Dim objExcel As Excel.Application Dim objWorkbook As Excel.Workbook Dim objWorkSheet As Excel.Worksheet Dim objExcelRange As Excel.Range Dim LastRow As Long strWorkingFile = frmGetReferences.lblOutputFolder.Text & "wf_" & frmGetReferences.txtOutputFilename.Text & ".xlsx" objExcel = CreateObject("Excel.Application") objWorkbook = objExcel.Workbooks.Open(strWorkingFile) objWorkSheet = objWorkbook.Sheets("sheet1") ' Sort the data in the "A" column; expanding selection to other columns objWorkSheet.Cells.Sort(Key1:=objWorkSheet.Range("A2"), _ Order1:=Excel.XlSortOrder.xlAscending, Header:=Excel.XlYesNoGuess.xlYes, _ Orientation:=Excel.XlSortOrientation.xlSortColumns, SortMethod:=Excel.XlSortMethod.xlPinYin) 'Remove rows based on duplicate values from "A" column LastRow = objWorkSheet.Range("A" & objWorkSheet.Rows.Count).End(Excel.XlDirection.xlUp).Row objExcelRange = objWorkSheet.Range("A1").Resize(LastRow, 3) objExcelRange.RemoveDuplicates(Columns:=1, Header:=Excel.XlYesNoGuess.xlYes) objWorkbook.Save() objWorkbook.Close() objExcel.Quit() objExcel = Nothing objWorkbook = Nothing objWorkSheet = Nothing End Sub

I've come up with something based on my idea from yesterday but as far as I can see the data in the sample file doesn't have any duplicates.
I tried it on the sample data you posted here and it seemed

This is the unsorted data:
A B C
1 Reference Page Document
2 2002 Page 1 test.docx
3 2002 Page 2 test.docx
4 1999 Page 1 test.docx
5 Test Data Page 3 test.docx
6 Test Data Page 3 test.docx
7 Test Data, 2002 Page 5 test.docx

This is the sorted data:
A B C
1 Reference Page Document
2 1999 Page 1 test.docx
3 2002 Page 2 test.docx
4 2002 Page 1 test.docx
5 Test Data Page 3 test.docx
6 Test Data, 2002 Page 3 test.docx

Ok imnorie, I used your idea (sort of) identifying if the data in the A column contained a a numerical value by inserting the following equation into column D:

=IF(ISNUMBER(A#), "Yes", "NO")

I then attempt to extract the data to a second workbook called "wf2_" & strWorkingFileTo

I have managed to copy the first two values into the new document; however, after that I get an error with the objExcelRange.offset(1) increment that I have created.

The intent is as follows:
1. Sort the data in "A" column in objWorkbook Alphabetically (done)
2. Move/Cut all the rows containing numbers in "A" column of objWorkbook into objWorkbookNew (errors in script)
3. Remove duplicates from "A" column in objWorkbook (done)
4. Move/Cut the remaining data in objWorkbook to objWorkbookNew
5. Delete objWorkbook

I have gone around in circles with this and again, my inexperience got the better of me.
I'm sure there is something small I am missing here. I have attached some sample data and the code to the point I have described above.

Hope someone can help.
Cheers
G_M

Public Sub SortDocumentInExcel() Dim objExcel As Excel.Application Dim strWorkingFileFrom As String Dim objWorkbook As Excel.Workbook Dim objWorkSheet As Excel.Worksheet Dim objExcelRange As Excel.Range Dim strWorkingFileTo As String Dim objWorkbookNew As Excel.Workbook Dim objWorkSheetNew As Excel.Worksheet Dim objExcelRange2 As Excel.Range Dim objExcelRange3 As Excel.Range Dim LastRow As Long Dim misValue As Object = System.Reflection.Missing.Value 'Values for original Excel document... The document with the raw data strWorkingFileFrom = frmGetReferences.lblOutputFolder.Text & "wf_" & frmGetReferences.txtOutputFilename.Text & ".xlsx" objExcel = CreateObject("Excel.Application") objWorkbook = objExcel.Workbooks.Open(strWorkingFileFrom) objWorkSheet = objWorkbook.Sheets("sheet1") objExcelRange = objWorkSheet.Range("A2") 'Values for the new Excel document... The document with the filtered data strWorkingFileTo = frmGetReferences.lblOutputFolder.Text & "wf2_" & frmGetReferences.txtOutputFilename.Text & ".xlsx" objWorkbookNew = objExcel.Workbooks.Add(misValue) objWorkSheetNew = objWorkbookNew.Sheets("sheet1") objWorkSheetNew.SaveAs(strWorkingFileTo) objExcelRange2 = objWorkSheetNew.Range("A" & objWorkSheetNew.Rows.Count).End(Excel.XlDirection.xlUp) 'Sort the data in the "A" column; expanding selection to other columns objWorkSheet.Cells.Sort(Key1:=objWorkSheet.Range("A2"), _ Order1:=Excel.XlSortOrder.xlAscending, Header:=Excel.XlYesNoGuess.xlYes, _ Orientation:=Excel.XlSortOrientation.xlSortColumns, SortMethod:=Excel.XlSortMethod.xlPinYin) 'Extract data if column "D" is equal to "Yes"... Extracts numeric data only Do While objExcelRange.Value <> 0 If objExcelRange.Row <> 1 Then If objExcelRange.Offset(, 3).Value = "Yes" Then objExcelRange.Cut(objExcelRange2) 'objExcelRange2.Paste() objWorkbookNew.Save() 'objExcelRange = objExcelRange.Offset(1) objExcelRange2 = objExcelRange2.Offset(1) Else Exit Do End If objExcelRange = objExcelRange.Offset(1) End If Loop 'Remove rows based on duplicate values from "A" column LastRow = objWorkSheet.Range("A" & objWorkSheet.Rows.Count).End(Excel.XlDirection.xlUp).Row objExcelRange3 = objWorkSheet.Range("A1").Resize(LastRow, 3) objExcelRange3.RemoveDuplicates(Columns:=1, Header:=Excel.XlYesNoGuess.xlYes) objWorkbookNew.Save() objWorkbookNew.Close() objWorkbook.Save() objWorkbook.Close() objExcel.Quit() objExcel = Nothing objWorkbook = Nothing objWorkSheet = Nothing End SubEnd Module

Hmmm... I didn't really know how to handle it with one workbook and though it easier to use two. But, being a novice leaves what ever you think appropriate as best :o) I am working with the intent to eventually process this data to three different formats at the end... .docx, .txt and .xlsx. I am hoping to pick up enough knowledge along the way to get this done.

I don't really care if I keep the origainal data, I just want to filter out what is not needed. This working file will eventually be deleted; however, it would probably be handy for debugging my mistakes.

Looking into it further, I seem to have an issue with my Do While Loop and the string data from the workbook isn't validating. I feel like I'm so close though.

I was just about to work all that out... Thank you again imnorie

0

Write Comment

By clicking you are agreeing to Experts Exchange's Terms of Use.

Featured Post

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

Workbook link problems after copying tabs to a new workbook?
David Miller (dlmille)
Intro
Have you either copied sheets to a new workbook, and after having saved and opened that workbook, you find that there are links back to the original souâ€¦