How to do this in VBA and Excel: <<set target = Regex.execute(ActiveSheet.EntireSheet)>>
I want to use regular expressions to grab a large list of street addresses from an Excel worksheet.
It is easy to do this when the lists are in a text file, but I don't see how to do the same with ActiveSheet.
I plan to save the sheet as a csv file then scan that, but I wonder if somebody has a better idea.
The spreadsheets are very big and performance is important, so I don't want to use iterations like For each cell in sheet.cells.
For instance, the following code scans a text file and grabs a large list. I would like to replace lines 2 to 4 with something that gets the entire activesheet quickly.
Sub t753() Open "C:\Users\bob.berke\Desktop\My Master.txt" For Input As #1 st = Input$(LOF(1), 1) Close #1Dim regex As New regexpDim ansregex.Global = Trueregex.MultiLine = Trueregex.IgnoreCase = Trueregex.pattern = "(\t\d{3} \d{3})(.*?)(fax:)" ' <=== the \t means "tab", which is roughly equivalent to Excel's "beginning of cell"Set match = regex.Execute(st)ans = ""For Each var In match ans = ans & vbCrLf & var Debug.Print varNextans = Mid(ans, 3)Dim dobj As New DataObjectdobj.SetText ansdobj.PutInClipboard' clipboard will have hundreds of lines that look like <<123 456 City Hall, Cleveland, OH 44101 Phone:216-505-123 Fax: >>End Sub
Have you speed tested loading the sheet into an array and looping through that? It will be much faster than reading cell by cell.
Robert Berke
ASKER
The spreadsheets contain cells that came from an internet web page
When the pattern "(\t\d{3} \d{3})(.*?)(fax:)" returns 100 matches, each of the 100 elementsmight come from multiple rows.
For instance match (5) might come from 2 rows and match(6) might come from a single row.
The sheet has lots of unrelated garbage between the rows that has to be ignored.
so, for Any type of iteration to work, it would first need to concatenate every cell to build a huge concatenated string. Starting the concatenation with ary = activesheet.usedrange would certainly speed up the process, but I expect saving the spreadsheet to a .csv would be faster still.