Link to home
Start Free TrialLog in
Avatar of rsburge
rsburgeFlag for United States of America

asked on

Search Email Body for Specific Filepath & Save as Variable

Hello Experts - I am using the attached code in an Access database to search for filepaths and URL's in the body of emails.  In all cases up to now, I have always needed the first URL, so this has always worked.  Now, I have 3 filepaths and the one I need is not the first one.

I have tried several things, but don't know how to adjust this code to look for the filepath I need.

In the attached PDF, the filepath I need is the 3rd one or the one highlighted in yellow.

The beginning of the filepath is always the same ("D:\aLXE-Pricing\FifthThirdWhsl\WS"); the last part of the filename changes everytime I receive the email.

Thanks!

Public Function FT_Link2() As Boolean
SDdrive = Forms("frmFirewall").txtSDdriveLetter.Value
    Dim olApp, objFolder, objNameSpace
    Dim objItems, objMess As Object ' As MailItem
    'Dim Regex As Object, regm As Object ', IE As Object
    Dim ieLink, i As Long
    Dim regitm As Variant
    strFilter = "[Categories]=""53"""
 
    Set olApp = CreateObject("Outlook.Application")
    Set objNameSpace = olApp.GetNamespace("MAPI")
    Set objFolder = objNameSpace.GetDefaultFolder(6)
    objFolder.Items.Sort "[Received]", False
    Set objItems = objFolder.Items
    objItems.Sort "[Received]", True
    Set objMess = objItems.Find(strFilter)
    
    If objMess Is Nothing Then
'        MsgBox "No email found!", vbCritical
        FT_Link2 = True
        Exit Function
    End If
    
    Dim pth As String
    Set Regex = CreateObject("vbscript.regexp")
    Regex.Pattern = "(file:|C:)(//|\\\\)[^\.].+\.xlsx"  'Regex.Pattern = "file:(//|\\\\)[^\.]+\.xlsx"
    Regex.Global = True
           regitm = Regex.Test(objMess.Body)
                If regitm Then   '<-------- ERROR OCCURS ON THIS LINE
                    Set regm = Regex.Execute(objMess.Body)
                     attchnm = ""
                     attchnm2 = ""
                     attchnm = Right$(regm(0), Len(regm(0)) - InStr(regm(0), "D:\aLXE-Pricing\FifthThird\WS_") + 1)
                     'debug.print "Attachnm1 - " & attchnm
                     attchnm = Left(attchnm, InStr(attchnm, "xlsx") + 3)
                     'debug.print "Attachnm1 - " & attchnm
                End If
    Set regm = Nothing
    Set Regex = Nothing
    Set objMess = Nothing
    Set objItems = Nothing
    Set objFolder = Nothing
    Set objNameSpace = Nothing
    Set olApp = Nothing
End Function

Open in new window

SampleEmail.pdf
Avatar of Chris Raisin
Chris Raisin
Flag of Australia image

Working on this for you.

What version of Access are you using and what references do you have incorporated in your code?

Cheers
Chris
(craisin-Australia)
Avatar of rsburge

ASKER

Hi - Good to hear from you.  :)

I am using Access 2007 and I have lots of references; are you looking for any specific references?

Thanks!
Renee
Specifically for Regex

Cheers
Chris
Avatar of rsburge

ASKER

Hi - The references for this code are below.

Public Regex As Object
Public regm As Object
Dim regitm As Variant
No, that is not what I meant, but I suddenly noticed the line:

                 Set Regex = CreateObject("vbscript.regexp")

which means that Regex is not a reference in a library, but rather a script as part of the
vbScript (which WILL be in your reference listing as "Microsoft.Scripting.Runtime"
as per below. So that answers my question.

Stand BY...

Cheers
Chris

User generated image
OK...I have a solution for you.

But how will you know when you tun the code you have supplied whether you want the first, second, third (etc.) instance of the string found?

Do you want the macro to ask each rime which attachment, or do you want it to process EACH file attachment it has?

Another alternative (although less efficient) is to have a macro that will only process the THIRD attachment (if there is one).

Or perhaps the file is always containing a filename in a format like the following?
                                       "WS_041612_EXTR.xlsx"

I am findiong it difficult to run the code since my Outlook is not set up the same way as yours anyway, I will supply the code to you once I know the answer to my question and you can test it out for me :-).

Cheers
Chris
Are there actual attachments on your email at the time the macro runs?

Also I notice the the pattern is:      (file:|C:)(//|\\\\)[^\.].+\.xlsx"
and yet the files appear to have been stored on drive D:

Please advise along with response to previous question(s).

Cheers
Chris
One final thing (I am never short for words! :-)  )

I cannot get the pattern in the line:

                          Regex.Pattern = "(file:|C:)(//|\\\\)[^\.].+\.xlsx"
to find an occurrence in the body of the email you posted as a PDF at the top of this question.

I changed the "C:" to "D:" since that is the drive showing in the body of that  email text, but that still did not work.

Are you sure the pattern is correct? (I changed it to a simple ".xlsx" and it found it, so I am wondering whether the pattern is correct for the files you have in the body of this message?

Cheers
Chris
OK...

I found if I change the string to the following it worked:

                          Regex.Pattern = "(D:)(//|\\|\\\\)[^\.].+\.xlsx"

but it only found one occurrence!

I think that may be due to the fact that the second filename "wraps" onto the next line.

Stand by while I try to see how we can get around that

Cheers
Chris
I just found a TERRIFIC site for testing your regular expression, and it helped me find the bug in your current expression.

The site is: http://www.rubular.com/

Anyway, the expression I worked out now is to show in the code as:

            Regex.pattern = "(file:|C:|D:|E:)(//|\\|\\\\)[^\.]+\.xlsx"

This will find ALL files ending in ".xlsx" which start with "file://" or "C:\" or D:\" or "E:\"
(you get my drift) and spans across multiple lines as well (I removed the "dot" after the last closing square bracket showing as    ]

The code I have is as follows.

It is a a test code only to show how it picks up each file.

Now we just have to decide whether it is always the LAST file that you are after, the THIRD file (if there is one) or WHATEVER.

Once you let me know which condition you want, we can tweak the code to suit.
(Notice that your manipulation of "attchnm" in the original code is no longer needed).

I was wondering  why you have set "FT_Link2 = True" when the search fails? I don't know, but I have left it that way since you must have a reason.

I also omitted the line 21 in your original code "Exit Function" since that bypassed the cleanup code at the end which should always be performed. The solution is to jump to the cleanup code at the end, and the procedure will then terminate after that anyway.

Perhaps this working bit of code is all you need now to change your code where required!


Cheers
Chris

Option Compare Database
Option Explicit
Private Sub main()
   FT_Link2
End Sub

Public Function FT_Link2(Optional nInst As Integer) As Boolean
    'SDdrive = Forms("frmFirewall").txtSDdriveLetter.Value
    
    Dim Match As Object
    
    Dim objFolder As Object
    Dim objItems As Object
    Dim objMess As Object
    Dim objNameSpace As Object
    Dim olApp As Object
    Dim RegEx As Object
    Dim Regm As Object
    
    Dim attchnm() As String
    Dim nItem As Integer
    Dim RegItm As Variant
    Dim strFilter As String
    
    
    strFilter = "[Categories]=""53"""
 
    Set olApp = CreateObject("Outlook.Application")
    Set objNameSpace = olApp.GetNamespace("MAPI")
    Set objFolder = objNameSpace.GetDefaultFolder(6)
    objFolder.Items.Sort "[Received]", False
    Set objItems = objFolder.Items
    objItems.Sort "[Received]", True
    Set objMess = objItems.Find(strFilter)
    
    If objMess Is Nothing Then
      MsgBox "No email found!", vbCritical
      FT_Link2 = True
      GoTo CleanUp
    End If
    
    Dim pth As String
    Set RegEx = CreateObject("vbscript.regexp")
    RegEx.Global = True
    RegEx.IgnoreCase = True
    RegEx.pattern = "(file:|C:|D:|E:)(//|\\|\\\\)[^\.]+\.xlsx"
    RegItm = RegEx.Test(objMess.body)
    If RegItm Then
      Set Regm = RegEx.Execute(objMess.body)
      ReDim attchnm(Regm.Count)
      For nItem = 0 To Regm.Count - 1
         attchnm(nItem) = Replace(Regm(nItem), vbCrLf, "")
         MsgBox attchnm(nItem)
         Debug.Print "Attachnm " + CStr(nItem) + " - " & attchnm(nItem)
       Next
    Else
      'Uncomment the following line if you require this message
      MsgBox "Pattern not found in email!", vbCritical
      FT_Link2 = True
    End If
    GoTo CleanUp

CleanUp:
    Set objFolder = Nothing
    Set objItems = Nothing
    Set objMess = Nothing
    Set objNameSpace = Nothing
    Set olApp = Nothing
    Set RegEx = Nothing
    Set Regm = Nothing
End Function

Open in new window

Avatar of rsburge

ASKER

Hi - Sorry for the delay...  Servers were down for a while and that was a nightmare to get fixed.  :)

To answer your questions...

1.  The file I need may not always be the third file or even the last fle, but the file path and first two letters of the filename will always be the same; i.e. ("D:\aLXE-Pricing\FifthThirdWhsl\WS".

2.  I do have a program (Auto-Mate Pro) which sets the categories and removes the attachments and saves them to the locations shown in the sample email. There are no actual attachments when this code runs, only the location of the files.

3.  I was running into the same problem with my code in that in only found the first instance of the .xlsx, that was the main reason for my questions; figuring out how to get a file based on the name rather than first matching instance of xlsx.

4.  The FT_Link2 returns True if the search fails because in the parent code, IF FT_Link2 = True, then the code is stopped because the file is not in the email.

I think this is all of your questions.  I will test this code you have posted as soon as I recover from the downed server debacle.  :)

Thanks!
Renee
OK Thats fine, but in the sample email there were TWO .xlxs files with that path.

Did you want both files processed or only one?

The files were:
 D:\aLXE-Pricing\FifthThirdWhsl\Correspondent.pdf]
 D:\aLXE-Pricing\FifthThirdWhsl\Correspondent Rate Sheet041612.xlsx]
 D:\aLXE-Pricing\FifthThirdWhsl\WS_041612_EXTR.xlsx]

Also, the code you gave does not actually do anything as far as I can see.
It just seems to send comments to the Debug Log via "Debug.Print"

Cheers
Chris
Avatar of rsburge

ASKER

The code I supplied stores the full file path in the variable attachnm.  I then use that in the parent code.

There are 2 files with the same path, but the actual file names are different...  One starting with Correspondent and the other starting with WS.

Is it possible to get only the one with the file name starting with WS and store that full filepath in attachnm?

attachnm = D:\aLXE-Pricing\FifthThirdWhsl\WS_041612_EXTR.xlsx

In the parent code, I use attachnm to open that excel file at a later point in the code; i.e. xlwb = xlapp.workbooks.open(attachnm)
ASKER CERTIFIED SOLUTION
Avatar of Chris Raisin
Chris Raisin
Flag of Australia image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of rsburge

ASKER

The code works great, I am just not sure how to get just the instance that has the filename that starts with WS.

When I run the code I get the results below in the immediate window.  I need to figure out how to "store" only Attachnm4 in a variable that I can use later in the parent code.  It won't always be the last file, but it the file name will always start with WS_

Attachnm 1 - file://D:\aLXE-Pricing\FifthThirdWhsl\Correspondent Rate Sheet 041712.xlsx
Attachnm 2 - D:\aLXE-Pricing\FifthThirdWhsl\Correspondent Rate Sheet 041712.xlsx
Attachnm 3 - file://D:\aLXE-Pricing\FifthThirdWhsl\WS_041712_EXTR.xlsx
Attachnm 4 - D:\aLXE-Pricing\FifthThirdWhsl\WS_041712_EXTR.xlsx
Excuse me, but arent the files REPEATED there? There appears to be two distinct files but they must be showing in the email as both "file://"  AND "D:\"?

How are you getting the value of your variable back to your other program? Is it a "Global" variable? If not, then you could either make the gvariable global so all cxode sees it (define it at the VERY top of your macros module as "Public attachnm as string") or
get your function to return a string (the filename) instead of Boolean.

In your calling code then, if your function  FT_Link2() returns "" (empty string) it is the same as "True" but if it returns a filename it is the same as False. (Just another option)

In the meantime we can tweak the filter if you can guarantee that there only ever be ONE file with the WS in it. I will work on that assumption at the moment and tweak the filter for you.

In the meantime, please advise why each file is showing twice in the output.



Stand By.

Cheers
Chris
Avatar of rsburge

ASKER

Hi - The emails are showing exaclty as the sample I sent, but they are a little strange in that when you look at them, they only show the D:\...
However, in the email itself, if you hover over the hyperlink to that file, it shows file:///D\:...

I believe this is why it shows twice.


attachnm is a global variable, but I can create another one, ftFile since you have used attachnm in this code in a different way.

There truely will only be one file in the email with WS in the filename.  The other files would come over with various other names, but in 5 years, I haven't seen any other attachments with WS in the filename.
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of rsburge

ASKER

Geeze!!  Thanks for all of your help on this...  The Co-Location is offline again so I can't test until it comes back up.  I will do that ASAP and get back to you.

Thanks!
Renee
Avatar of rsburge

ASKER

That works perfectly!  Thank you!!

Renee