Need someone to modify this VBA code to handle Internet multiline header fields

I need the below code modified to handle Internet headers that are "folded" into multiple lines per
'           /* Rfc 2822 2.2 Header Fields
'               Header fields are lines composed of a field name, followed by a colon
'               (":"), followed by a field body, and terminated by CRLF.  A field
'               name MUST be composed of printable US-ASCII characters (i.e.,
'               characters that have values between 33 and 126, inclusive), except
'               colon.  A field body may be composed of any US-ASCII characters,
'               except for CR and LF.  However, a field body may contain CRLF when
'               used in header "folding" and  "unfolding" as described in section
'               2.2.3.  All field bodies MUST conform to the syntax described in
'               sections 3 and 4 of this standard.
'
'              Rfc 2822 2.2.3 (Multiline header fields)
'               The process of moving from this folded multiple-line representation
'               of a header field to its single line representation is called
'               "unfolding". Unfolding is accomplished by simply removing any CRLF
'               that is immediately followed by WSP.  Each header field should be
'               treated in its unfolded form for further syntactic and semantic
'               evaluation.
'
'               Example:
'                   Subject: aaaaa<CRLF>
'                   <TAB or SP>aaaaa<CRLF>
'

I have attached 2 sample headers (email addresses are not real); the "To:" header is the header of interest.

Basically, I need the code modified to "unfold" multiline header fields by removing CRLF when it is immediately followed by whitespace.  This needs to be done BEFORE parsing the words to decode.

Thanks for any help--
Sam

Option Explicit

Const C_SEP = " "
Const C_EQUAL = "="
Const C_QUEST = "?"
'

Sub TestAllDecode()
    Dim arrh: arrh = Array( _
        "Test: =?utf-8?b?SWhyZSBCZXN0ZWxsdW5nIC0gVmVyc2FuZGJlc3TDpHRpZ3VuZyAtIDExMDU4OTEyNDY=?=", _
        "Subject: =?utf-8?Q?Your=20Weekly=20HAR=205=2DMinute=20REALTOR=C2=AE=3A=20Posts=20for=20the=20week=20of=2003=2F06=2F2012?=", _
        "To: =?utf-8?Q?Fred?= <HAR@Equity.com>" _
    )
    Dim strH, strW, d
    For Each strH In arrh
        d = ""
        For Each strW In Split(strH, C_SEP)
            If d <> "" Then d = d & C_SEP
            d = d & mhdecode(strW)
        Next
        Debug.Print d
    Next
End Sub

' TODO:
' * "_" => space
' * first word always unencoded?
' * comments in parentheses for specific fields?

Function mhdecode(ByVal s As String) As String
    Dim hdr As String: hdr = C_EQUAL & C_QUEST ' "=?utf-8?"
    Dim ftr As String: ftr = C_QUEST & C_EQUAL
    Dim lenS As Integer: lenS = Len(s)
    If Left(s, Len(hdr)) = hdr And Right(s, Len(ftr)) = ftr Then
        Dim p1 As Integer: p1 = InStr(Len(hdr) + 1, s, C_QUEST)
        If p1 > Len(hdr) + 1 And p1 < lenS - 2 Then
            Dim p2 As Integer: p2 = InStr(p1 + 1, s, C_QUEST)
            If p2 > p1 + 1 And p2 < lenS - 2 Then
                Dim encCharSet As String: encCharSet = Mid(s, Len(hdr) + 1, p1 - Len(hdr) - 1)
                Dim encType As String: encType = Mid(s, p1 + 1, p2 - p1 - 1)
                Dim Message: Set Message = CreateObject("CDO.Message")
                Select Case UCase(encType)
                    Case "Q"
                        Message.BodyPart.ContentTransferEncoding = "quoted-printable"
                    Case "B"
                        Message.BodyPart.ContentTransferEncoding = "base64"
                    Case Else
                        Message.BodyPart.ContentTransferEncoding = "us-ascii" ' ?
                End Select
                Dim Stream 'As ADODB.Stream
                Set Stream = Message.BodyPart.GetEncodedContentStream
                'Set charset To base windows charset
                Stream.Charset = "windows-1250"
                Stream.WriteText Mid(s, p2 + 1, lenS - p2 - Len(ftr))
                Stream.Flush
                Stream.Close
                Set Stream = Nothing
                Set Stream = Message.BodyPart.GetDecodedContentStream
                Stream.Charset = encCharSet
                mhdecode = Stream.ReadText
                Stream.Close
                Set Stream = Nothing
            End If
        End If
    Else
        mhdecode = s
    End If
End Function

Open in new window


Code provided by robert_schutt (Thanks again!)
Header1.txt
Header2.txt
SAbboushiAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

CodeCruiserCommented:
0
SAbboushiAuthor Commented:
I don't believe it is that simple...

Anyone else?  Again, looking for someone to modify the code ; )
0
Robert SchuttSoftware EngineerCommented:
Here's what I ended up doing. The reason for the double replace with Regular Expressions instead of just a normal Replace is that I think it's important to treat these cases differently:

1) need to take out spacing between the encoded words:

Subject: =?UTF-8?Q?Well=2C_let=27s_try_a_subject_with_s?=
	=?UTF-8?Q?=C3=96me_str=C4=83nge_characters_and_a_bit?=
	=?UTF-8?Q?_longer_so_it_may_span_more_than?=
	=?UTF-8?Q?_1_line_=E2=98=BA_and_how_about_graphics?=
	=?UTF-8?Q?=3F_=C2=B6?=

Open in new window


2) need to retain some form of spacing (I chose 1 space):

Received: from [70.101.42.90] (70-101-42-90.dsl2-plymouth.roc.ny.frontiernet.net [70.101.42.90])
	by relay04.roch.ny.frontiernet.net (Postfix) with ESMTP id E300ECCDF2;
	Thu, 11 Dec 2008 02:47:00 +0000 (UTC)

Open in new window


The code:

Option Explicit

Const C_TEMP_PATH = "C:\temp\"

Const C_EQUAL = "="
Const C_QUEST = "?"

Sub TestAllDecode()
    Dim arrh: arrh = Array( _
        "Test: =?utf-8?b?SWhyZSBCZXN0ZWxsdW5nIC0gVmVyc2FuZGJlc3TDpHRpZ3VuZyAtIDExMDU4OTEyNDY=?=", _
        "Subject: =?utf-8?Q?Your=20Weekly=20HAR=205=2DMinute=20REALTOR=C2=AE=3A=20Posts=20for=20the=20week=20of=2003=2F06=2F2012?=", _
        "To: =?utf-8?Q?Fred?= <HAR@Equity.com>" _
    )
    Dim strH
    For Each strH In arrh
        Debug.Print DecodeHeaders(strH)
    Next
    DecodeFileWithHeaders "Header1.txt"
    DecodeFileWithHeaders "Header2.txt"
    'DecodeFileWithHeaders "Header3.txt"
End Sub

Function DecodeHeaders(ByVal strH As String) As String
    Dim res As String, ps As Integer
    res = ""
    ps = 1
    Dim re As New RegExp
    re.Global = True
    re.MultiLine = False
    re.IgnoreCase = True
    re.Pattern = "=" & vbCrLf & "[ \t]+="
    strH = re.Replace(strH, "==")
    re.Pattern = vbCrLf & "[ \t]+"
    strH = re.Replace(strH, " ")
    re.Pattern = "=\?[a-z0-9-]+\?[BQ]\?.+?\?="
    Dim m As Match
    For Each m In re.Execute(strH)
        res = res & Mid(strH, ps, m.FirstIndex + 1 - ps) & mhdecode(m.Value)
        ps = m.FirstIndex + m.Length + 1
    Next
    res = res & Mid(strH, ps)
    DecodeHeaders = res
End Function

Sub DecodeFileWithHeaders(fn As String)
    Debug.Print "--- file: " & fn
    Dim oFSO As Object, oFile As Object, c As String
    Set oFSO = CreateObject("Scripting.FileSystemObject")
    Set oFile = oFSO.OpenTextFile(C_TEMP_PATH & fn, 1, False)
    c = oFile.ReadAll()
    oFile.Close
    Set oFile = Nothing
    Set oFSO = Nothing
    Debug.Print DecodeHeaders(c)
End Sub

Function mhdecode(ByVal s As String) As String
    Dim hdr As String: hdr = C_EQUAL & C_QUEST
    Dim ftr As String: ftr = C_QUEST & C_EQUAL
    Dim lenS As Integer: lenS = Len(s)
    If Left(s, Len(hdr)) = hdr And Right(s, Len(ftr)) = ftr Then
        Dim p1 As Integer: p1 = InStr(Len(hdr) + 1, s, C_QUEST)
        If p1 > Len(hdr) + 1 And p1 < lenS - 2 Then
            Dim p2 As Integer: p2 = InStr(p1 + 1, s, C_QUEST)
            If p2 > p1 + 1 And p2 < lenS - 2 Then
                Dim encCharSet As String: encCharSet = Mid(s, Len(hdr) + 1, p1 - Len(hdr) - 1)
                Dim encType As String: encType = Mid(s, p1 + 1, p2 - p1 - 1)
                Dim Message: Set Message = CreateObject("CDO.Message")
                Select Case UCase(encType)
                    Case "Q"
                        Message.BodyPart.ContentTransferEncoding = "quoted-printable"
                        s = Replace(s, "_", " ")
                    Case "B"
                        Message.BodyPart.ContentTransferEncoding = "base64"
                End Select
                Dim Stream 'As ADODB.Stream
                Set Stream = Message.BodyPart.GetEncodedContentStream
                'Set charset To base windows charset
                Stream.CharSet = "windows-1250"
                Stream.WriteText Mid(s, p2 + 1, lenS - p2 - Len(ftr))
                Stream.Flush
                Stream.Close
                Set Stream = Nothing
                Set Stream = Message.BodyPart.GetDecodedContentStream
                Stream.CharSet = encCharSet
                mhdecode = Stream.ReadText
                Stream.Close
                Set Stream = Nothing
            End If
        End If
    Else
        mhdecode = s
    End If
End Function

Open in new window

0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
SAbboushiAuthor Commented:
Absolutely Awesome!
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Outlook

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.