Solved

pulling html code out of the body of the message.

Posted on 2003-10-23
8
428 Views
Last Modified: 2013-12-18
I am trying to script something or export all the messages in my mailbox to give me the body of the message unaltered by Notes.  If you look at document properties the body tag can contain HTML code but when opened Notes strips that out.  :-(  I am trying to take 5000 some email and put them into one text document and pull out any HTML tags for further processing and adding to our spam filter,  This is becoming a big pain in the rear.  

-Brian
0
Comment
Question by:bkirk007
  • 3
  • 2
8 Comments
 
LVL 24

Accepted Solution

by:
HemanthaKumar earned 125 total points
ID: 9611091
First of all there is no straight forward method in R5. R6 I am not sure, but it has enhanced classes to handle mime type fields more elegantly.

So, to make it easy I would suggest you to use HTTP stream to obtain the content in html. How ?

Two possibilities, you can use java io stream to do that and second use GetDocumentByURL method to get html. For both these methods http task should be up and running.

Read this forum's question for more details on this technique.
http://experts-exchange.com/Applications/Email/Lotus_Notes/Q_20772364.html
~Hemanth
0
 
LVL 31

Expert Comment

by:qwaletee
ID: 9611610
Do you need just the bodies?  You canuse the MIME_PART type on Body (only available when you set Convertime=False).  Then, just recursively pul all the parts and spit them back out. I have some sample code...
0
 
LVL 31

Assisted Solution

by:qwaletee
qwaletee earned 125 total points
ID: 9611621
This is hard coded for teh currently open NotesUiDocument, and hard-coded for c:\mime.txt.

It should not be difficult to adapt it to work with a NotesDocument cllection, and to name the otput file for each document based on the NoteID or UNID.

I would not dump all to a single file, since you will run into MIME parsing problems.


      Dim ws As New notesUiWorkspace
      Set doc = ws.currentDocument.document
      
      Dim s As New notesSession
      s.ConvertMime = False
      
      Dim body As notesItem
      Set body = doc.getFirstItem("Body")
      
      If body.type = MIME_PART Then
            Open "c:\mime.txt" For Output As #1
            Print #1 , "Mime-Version 1.0"
            Print #1 ,
            Print #1 , {This is a MIME-formatted message.  If you see this text it means that your
E-mail software does not support MIME-formatted messages.  Notes RTF MIME dump by Dovid Gross.}
            Print #1 ,
            Print #1 , "--" HeadersWIthBoundary ( body.getMimeEntity , Null , "" , doc.noteID) 'mime, headers-for-return, default boundary, filename
            DumpMIME body.getMimeEntity , "" , doc.universalID 'mime , level.level , boundary
            Close #1
      Else
            isComplexMIME = False
      End If
      
      
      Stop
      s.ConvertMime = True
      Delete body
      
      If Not isComplexMIME Then
            Dim text As String
            Set body = doc.getFirstItem("Body")
            If body.type = Richtext Then
                  Dim rtf As notesRichTextItem
                  Set rtf = Body
                  text = rtf.getFormattedText(True,72)
            Else
                  text = body.text
            End If
            Dim pos As Integer , nl As String
            nl = Chr$(13) + Chr$(10)
            pos = Instr(text,nl)
            While pos
                  Dim two As String*2
                  two = Mid$(text,pos+2,2)
                  If two <> nl And two <> "==" Then
                        two = Mid$(text,pos-2,2)
                        If two <> "==" Then
                              Mid$(text,pos,2) = "  "
                        End If
                  End If
                  pos = Instr(pos+2,text,nl)
            Wend
            Set doc = New notesDocument ( doc.parentDatabase )
            doc.form = "memo"
            doc.body = text
            ws.editDocument True , doc
      End If


Sub DumpMIME ( mime As notesMimeEntity , Byval level As String , boundary As String)
      
      Dim count As Integer
      
      If mime Is Nothing Then
            Print #1 ,
            Print #1 , ";(Empty MIME part - " & level & "x)"
            Print #1 ,
            Exit Sub
      End If
      
      If level <> "" Then
            isComplexMIME = True
      End If
      
      Do Until mime Is Nothing
            count = count + 1
            
            If (count > 1) Then
                  isComplexMIME = True
            End If
            
            Dim headers As String
            boundary = HeadersWithBoundary ( mime , headers , boundary , level & count)
            If (mime.contentType <> "text") Or (mime.contentSubType <> "plain") Then
                  isComplexMIME = True
            End If
            
            Print #1 ,
            Print #1 , "--" boundary
            Print #1 , ";Part " & level & count
            Print #1,  ";Content type:" & Chr(9) & mime.ContentType
            Print #1 , ";Content subtype:" & Chr(9) & mime.ContentSubtype
            Print #1 , ";Headers..."
            
            Print #1 ,  Headers
            Print #1 ,
            Print #1 , mime.ContentAsText
            Print #1 ,
            
            Print #1 , ";Children..."
            Print #1 ,
            DumpMIME mime.getFirstChildEntity , level & count & "." , boundary
            Set mime = mime.getNextSibling
      Loop
      Print #1 , "--" boundary "--"
End Sub
Function HeadersWithBoundary ( mime As notesMimeEntity , headers As Variant , Byval boundary As String , id As String) As String
      doc.tempEvaluate = mime.Headers
      Dim newHeaders As Variant
      newHeaders = Evaluate({@ReplaceSubString(tempEvaluate; @Char(13); @Char(13) + @Char(10))} , doc )
      doc.removeItem "tempEvaluate"
      headers = newHeaders(0)
      Const boundMark = {Boundary="}
      Dim boundStart As Integer , boundEnd As Integer, boundSize
      boundStart = Instr(headers,boundMark)
      If boundStart Then
            boundStart = boundStart + Len(boundMark)
            boundEnd = Instr(boundStart , headers , {"})
            If boundEnd Then
                  boundSize = boundEnd - boundStart
                  boundary = Mid$(headers , boundStart , boundSize )
            Else
                  boundary = Mid$(headers , boundStart )
            End If
      Else
            boundary = boundary & "___" & doc.universalID
      End If
      
      Const typeMark = {Content-type: }
      boundStart = Instr(headers,typeMark)
      If boundStart Then
            boundStart = boundStart + Len(typeMark)
            boundEnd = Instr(boundStart,headers,"; ")
            boundSize = Instr(boundStart,headers," ")
            If boundEnd < 1 Or (boundSize < boundEnd And boundSize > 0) Then
                  boundEnd = boundSize - 1
                  Print "space as marker " id
            End If
            boundSize = Instr(boundStart,headers,Chr$(10))
            If boundEnd < 1 Or (boundSize < boundEnd And boundSize > 0) Then
                  boundEnd = boundSize -1
                  Print "lf as marker " id
            End If
            boundSize = Instr(boundStart,headers,Chr$(13))
            If boundEnd < 1 Or (boundSize < boundEnd And boundSize > 0) Then
                  boundEnd = boundSize - 1
                  Print "cr as marker " id
            End If
            If boundEnd = 0 Then
                  headers = headers & "; name=" & id & ".txt"
                  Print "no marker " id
            Else
                  headers = Left$(headers, boundEnd) & " ; name=" & id & ".txt; " & Mid$(headers, boundEnd + 1)
            End If
      Else
            Print "Unable to insert " id
      End If
      
      HeadersWithBoundary = boundary
End Function
0
 
LVL 24

Expert Comment

by:HemanthaKumar
ID: 9614249
Wow, now somebody writes code for free !
0
 
LVL 31

Expert Comment

by:qwaletee
ID: 9617655
Nope, pulled that out of a test file I created to debug a MIME problem.
0

Featured Post

What Is Threat Intelligence?

Threat intelligence is often discussed, but rarely understood. Starting with a precise definition, along with clear business goals, is essential.

Join & Write a Comment

Suggested Solutions

This is an old article, please see an updated version of this article, located here: http://www.experts-exchange.com/articles/23619/Notes-8-5x-Windows-7-Notes-info-and-tips.html
  In today’s Arena we can’t imagine our lives without Internet as we are highly used to of it. If we consider our life style just for only 2 min we found that face to face communication is swapped by e-communication.  Every Where from Works place to…
This video discusses moving either the default database or any database to a new volume.
Here's a very brief overview of the methods PRTG Network Monitor (https://www.paessler.com/prtg) offers for monitoring bandwidth, to help you decide which methods you´d like to investigate in more detail.  The methods are covered in more detail in o…

743 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

12 Experts available now in Live!

Get 1:1 Help Now